Compound Arrays for Sample Profiling

ABSTRACT

The invention provides arrays of compounds for use in profiling samples. The arrays include compounds that bind to components of the samples at relatively low affinities. The avidity of compounds binding to components of the samples can be increased by forming arrays such that multivalent components of the samples (e.g., antibodies or cells) can bind to more than one molecule of a compound at the same time. When a sample is applied to an array under such conditions, the compounds of the array bind to component(s) of the sample with significantly different avidities, generating a profile characteristic of the sample.

CROSS-REFERENCE TO RELATED CASES

The present application claims priority from U.S. Ser. No. 61/218,890 filed Jun. 19, 2009 and U.S. Ser. No. 61/249,147 filed October 6, which are incorporated by reference in their entirety for all purposes.

BACKGROUND OF THE INVENTION

ELISA reactions and other simple immunoassays are commonly used for diagnosing disease. Many assays are configured to detect a single analyte. Therefore, when several differential diagnoses are possible, several different such assays are often conducted in parallel.

Existing approaches for broadly characterizing an immune response involve multiple standard ELISAs, use of library panning involving multiple rounds of selection, or printing of known proteins from pathogens or host proteins in an array format to detect antibodies to pathogens or autoantibodies. T-cells and B-cells have also been characterized by isolating and cloning specific regions of the T- and B-cell genome to sequence the recombination event. All of these processes are labor intensive and take time. They are not conducive to a standard clinical diagnostic protocol or early detection before specific analytes for which specific binding reagents are provided.

SUMMARY OF THE CLAIMED INVENTION

The invention provides methods of analyzing a sample, comprising: (a) contacting the sample with an array of immobilized different compounds occupying different areas of the array, wherein different molecules of the same compound within an area are spaced sufficiently proximate to one another for multivalent binding between at least two of the different molecules in the same area and a multivalent binding partner; and (b) detecting binding of the different compounds in the array to component(s) of the sample, such as antibodies. The sample can thus be characterized from the relative binding of compounds whose binding to the sample would be difficult to distinguish from each other and nonspecific binding under conditions of monovalent binding but which show significantly different binding from each other and nonspecific binding due to multivalent binding of the sample to the compounds, thereby generating a binding profile characteristic of the sample.

In some methods, the sample is characterized from the relative binding of a plurality of the different compounds binding to the sample with dissociation constants of 1 mM to 1 μM. In some methods, the sample is characterized from the relative binding of at least 10 or 100 compounds binding to the sample with dissociation constants of 1 mM to 1 μM. In some methods, the sample is characterized by comparing a binding profile of the sample that includes the relative binding of the plurality of compounds with a reference binding profile. In some methods, the average spacing between molecules of a compound in an area of the array is less than 6 nm.

Optionally, the method further comprises identifying a component of the sample that binds to the different compounds. Optionally, the method further comprises detecting the identified component with a binding partner known to bind the component. Optionally, the identified component is detected with a plurality of different binding partners known to bind the component. Optionally, the binding partner is an antibody to the identified component. Optionally, the binding partner is a peptide known to bind the identified component. Optionally, the binding partner is one of the different compounds detected in step (b). Optionally, the binding partner is a synbody. Optionally, the binding partner is immobilized to a support. Optionally, the binding partner is immobilized to a support in an array. Optionally, the method further comprises forming a second array, the second array containing one or more of the different compounds in the array binding to the identified component but not all of the different compounds in the array. Optionally, the second array contains less than 5% of the different compounds in the array. Optionally, the method further comprises forming an array or other device comprising compounds determined to bind to the sample but not all of the different compounds in the array.

In some methods, the detecting step detects binding of the different compounds to an antibody or antibodies in the sample. In some methods, the detecting step detects binding of the different compounds to a biological entity displaying multiple copies of a protein from its outer surface. In some methods, the biological entity is a cell displaying multiple copies of a receptor from its outer surface. In some methods, the compound having strongest binding to the sample binds to the component of the sample to which it has strongest binding affinity with an avidity of 0.1 μM to 1 mM. In some methods, the compounds are peptides or small molecules. In some methods, the array has 500-50,000 peptides. In some methods, the peptides are 10-30 amino acids long. In some methods, the sequences of the peptides are randomly selected. In some methods, the different immobilized compounds are selected without regard to the sample and the array further comprises a plurality of compounds known to bind different proteins also occupying different areas of the array. In some methods, the plurality of compounds known to bind different proteins includes compounds known to bind at least 25%, 50% or 75% of different human proteins. In some methods, the different immobilized compounds include a plurality of compounds known to bind at least 25%, 50% or 75% of different human proteins and 500-50,000 random peptides. In some methods, the sequences of the peptides have less than 90% sequence identity to a known binding partner of the target. In some methods, the sequences of the peptides have less than 90% sequence identity to known proteins. In some methods, the average spacing between compounds in an area of the array is 2-4 nm. In some methods, the average spacing between compounds in an area of the array is 3 nm. In some methods, the contacting of the sample to the array is performed in the presence of a potential competitor of binding of the sample to the array.
In some methods, the competitor is a known binding partner of a suspected component of the sample.

In some methods, the sample is a patient sample, and the competitor is a protein known to be associated with a disease affecting the patient. In some methods, the sample is a patient sample. In some methods, the sample contains a plurality of antibodies. In some methods, the patient is known or suspected to be suffering from a disease. In some methods, the patient is known to be at risk of a disease but is not showing symptoms of the disease. In some methods, the disease is an autoimmune disease, an infectious disease, or a disease of the CNS. In some methods, the sample is a blood, urine, or CNS sample. In some methods, component(s) of the sample are labeled. In some methods, binding of the peptides to component(s) of the sample is detected using a secondary antibody. In some methods, the secondary antibody is an isotype-specific antibody. In some methods, binding of the peptides to component(s) of the sample is detected by surface plasmon resonance (SPR) or mass spectrometry.

Some methods further comprise affinity purifying a component of the sample using a peptide determined to bind to the sample. Some methods further comprise washing unbound component(s) of the sample from the array, and dissociating bound component(s) from the array. Some methods further comprise preparing an antibody library from the patient and using a peptide to which an antibody in the sample binds as an affinity reagent to screen the library. Some methods further comprise identifying a natural binding partner of the affinity purified antibody. Some methods further comprise comparing the sequence(s) of peptide(s) binding to the component(s) of the sample to a database of natural sequences to identify natural binding partner(s) of components of the sample. Some methods further comprise comparing a profile of different compounds binding to the sample with profiles of the different compounds associated with different diseases or different stages of a disease to diagnose a patient as having one of the diseases or stages of disease. Some methods further comprise comparing a profile of different compounds binding to the sample with a profile of the different compounds associated with lack of a disease to determine whether a disease is present. Some methods further comprise repeating the method for different samples from a plurality of patients with the same disease to develop a binding profile characteristic of the disease. Some methods further comprise repeating the method for different samples from a plurality of patients with different diseases to develop a plurality of binding profiles characteristic of different diseases.

The invention further provides an array of immobilized different compounds occupying different areas of the array, wherein different molecules of the same compound within an area are spaced sufficiently proximate to one another for multivalent binding between at least two of the different molecules in the same area and a multivalent binding partner.

The invention further provides methods of analyzing a sample, comprising: contacting the sample with an array of immobilized different peptides occupying different areas of the array; detecting binding of the different peptides in the array to component(s) of the sample; and characterizing the sample from the relative binding of a plurality of peptides with apparent dissociation constants between 1 mM and 1 μM for the sample. Optionally, the sample is characterized from the relative binding of at least ten peptides with apparent dissociation constants between 1 mM and 1 μM.

The invention further provides a method of characterizing a plurality of different samples, comprising contacting the different samples with the same array or copies of the same array of immobilized different peptides occupying different areas of the array; and detecting different binding profiles of the different peptides to the different samples; wherein the samples are characterized from their respective binding patterns. Optionally, the plurality of different samples includes samples from patients with different disease symptoms. Optionally, the plurality of different samples includes samples of patients presenting with disease and lack of disease.

The invention further provides a method of analyzing a sample, comprising: contacting the sample with an array of immobilized different compounds occupying different areas of the array, wherein different molecules of the same compound within an area are spaced at an average distance of less than 4 nm apart in the same area; and detecting binding of the different compounds in the array to component(s) of the sample.
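
By way of illustration only, the relationship between an average spacing and the number of molecules per unit area can be sketched as follows. The sketch assumes molecules arranged on a square lattice, which is a simplifying assumption not stated in this description; the function name `surface_density_per_um2` is hypothetical.

```python
def surface_density_per_um2(spacing_nm: float) -> float:
    """Approximate molecules per square micrometre for an average spacing.

    Assumption (illustrative only): molecules sit on a square lattice,
    so each molecule occupies spacing_nm ** 2 of surface area.
    """
    if spacing_nm <= 0:
        raise ValueError("spacing must be positive")
    area_per_molecule_nm2 = spacing_nm ** 2
    return 1e6 / area_per_molecule_nm2  # 1 square micrometre = 1e6 nm^2


# Spacings mentioned in this description: 2-4 nm, 3 nm, and less than 6 nm.
for d in (2.0, 3.0, 4.0, 6.0):
    print(f"{d} nm spacing: about {surface_density_per_um2(d):.3g} molecules per um^2")
```

Under this assumption, halving the spacing quadruples the density, which is why small changes in spacing can matter for whether a multivalent binding partner can reach two molecules at once.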

The invention further provides a method of analyzing a sample, comprising:

(i) contacting a sample with a known binding partner of a component of the sample; and (ii) determining whether the binding partner binds to the sample compared with a control lacking the component; wherein the known binding partner is identified by a process comprising (a) contacting an initial sample with an array of immobilized different compounds occupying different areas of the array, wherein different molecules of the same compound within an area are spaced sufficiently proximate to one another for multivalent binding between at least two of the different molecules in the same area and a multivalent binding partner; (b) detecting binding of the different compounds in the array to component(s) of the sample; (c) identifying a component of the sample that binds to the different compounds; and (d) identifying a known binding partner of the component for use in step (i).

Optionally, the component in step (i) is detected with a plurality of different binding partners known to bind the component. Optionally, the binding partner is an antibody to the component. Optionally, the binding partner is a peptide known to bind the component. Optionally, the binding partner is one of the different compounds detected in step (b). Optionally, the binding partner is immobilized to a support in an array in step (i).

The invention further provides a method of manufacturing a device for use in detecting a component of a sample comprising: (a) contacting the sample with an array of immobilized different compounds occupying different areas of the array, wherein different molecules of the same compound within an area are spaced sufficiently proximate to one another for multivalent binding between at least two of the different molecules in the same area and a multivalent binding partner; (b) detecting binding of the different compounds in the array to component(s) of the sample; and (c) forming a device including a known binding partner of a component to which binding of the different compounds is detected in step (b).

Some methods further comprise identifying the component of the sample that binds to the different compounds. Some methods further comprise forming the device including a plurality of different binding partners known to bind the component. Optionally, the binding partner is an antibody to the component. Optionally, the binding partner is a peptide known to bind the component. Optionally, the binding partner is one of the different compounds detected in step (b). Optionally, the binding partner is a synbody. Optionally, the binding partner is immobilized to a support. Optionally, the binding partner is immobilized to a support in an array. Optionally, step (c) comprises forming a second array, the second array containing one or more of the different compounds in the array binding to the component but not all of the different compounds in the array. Optionally, the second array contains less than 5% of the different compounds in the array.

The invention further provides a method of testing a vaccine, comprising contacting a blood sample of a subject immunized with a vaccine against a pathogenic microorganism with an array of immobilized different compounds occupying different areas of the array; detecting a pattern of binding of the sample to the different compounds in the array; and comparing the pattern of binding to the pattern of binding of one or more reference samples, wherein the reference samples are from subjects who have survived an infection with the virus, similarity of binding profile between the subject and reference samples providing an indication the vaccine is effective against the pathogenic microorganism. Optionally, the subjects have been immunized with a vaccine before exposure to the virus.

The invention further provides an array of immobilized different compounds occupying different areas of the array, wherein the different compounds include a plurality of compounds known to bind at least 25%, 50% or 75% of the known human proteins and 500-1,000,000 or more random peptides, wherein different molecules of the same peptide within an area are spaced sufficiently proximate to one another for multivalent binding between at least two of the different molecules in the same area and a multivalent binding partner.

The invention further provides a method of analyzing a sample, comprising contacting the sample with an array of immobilized different compounds occupying different areas of the array; detecting binding of the different compounds in the array to component(s) of the sample; and characterizing the sample from the relative binding of a plurality of compounds having binding strengths to the sample greater than but within three orders of magnitude of the mean plus three standard deviations of the binding strength of empty areas in the array. Optionally, the characterizing comprises comparing a binding profile of the sample including the plurality of compounds with a reference binding profile including the plurality of compounds. Optionally, the sample is characterized from the relative binding of at least 10 or 100 compounds having binding strengths to the sample greater than but within three orders of magnitude of the mean plus three standard deviations of the binding strength of empty areas in the array.
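
By way of illustration only, the selection criterion just described (binding strength greater than, but within three orders of magnitude of, the mean plus three standard deviations of empty areas) can be sketched as follows. The function name, the compound names, and the intensity values are hypothetical, and the use of the sample standard deviation is an assumption.

```python
from statistics import mean, stdev


def informative_compounds(signals, empty_signals, fold_window=1000.0):
    """Return names of compounds whose signal lies above the background
    threshold (mean + 3 SD of empty areas) and within fold_window times
    that threshold.

    Assumption: the sample standard deviation (statistics.stdev) is used.
    """
    threshold = mean(empty_signals) + 3 * stdev(empty_signals)
    upper = threshold * fold_window
    return [name for name, s in signals.items() if threshold < s <= upper]


# Hypothetical fluorescence intensities, for illustration only.
empty = [100, 110, 90, 105, 95]
signals = {"pep_A": 120, "pep_B": 500, "pep_C": 50_000, "pep_D": 200_000}
print(informative_compounds(signals, empty))
```

In this made-up data set the threshold is about 124, so `pep_A` falls below it and `pep_D` exceeds the 1000-fold window; only `pep_B` and `pep_C` are retained.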

BRIEF DESCRIPTIONS OF THE FIGURES

FIG. 1 shows binding of 800 peptides to various antibodies and the degree of separation of these antibodies.

FIG. 2 shows the signal from peptides that bind to monoclonal antibodies in the presence or absence of competitor.

FIG. 3 shows the binding intensities to the 300 most informative peptides that distinguish mixtures of two different monoclonal antibodies with and without competitor. Different levels of white to black indicate strength of binding of peptide to antibody.

FIG. 4 shows the binding profiles of patients with either Valley Fever or Influenza, Influenza Vaccine recipients, or healthy volunteers, indicating the distinct pattern of binding specific to a disease for the 100 informative peptides shown.

FIG. 5 shows the binding profile of the 88 most informative peptides that distinguish Asthma, Valley Fever (Cocci), Sarcoma, Breast Cancer, Glioma, Pancreatic Cancer, Pancreatitis, and healthy volunteers. The lower panel shows the principal components map for the same peptides and provides a view of the degree of separation between samples (in 2 dimensions).

FIG. 6 shows principal component analysis of patients having or at risk of pancreatic cancer in the context of other diseases (Breast Cancer, Sarcoma, Pancreatitis, Glioma).

FIG. 7 compares binding profiles of normal volunteers, Breast Cancer patients, and patients at risk of breast cancer who clinically have had a second, different primary tumor following an initial remission of Breast Cancer.

FIG. 8 shows a pattern for influenza in mice blocked using whole virus particles pre-adsorbed to antisera from infected mice. This same pattern was not blocked by an irrelevant virus.

FIGS. 9A, B and C show that an antibody pulled down using peptides from the immunosignaturing microarray can detect the influenza particles. FIG. 9A shows that antibody detecting PR8 particles. FIG. 9B is a positive control showing where the antibody is detecting influenza particles. FIG. 9C is a negative control pull-down from the beads alone.

FIG. 10A shows a hierarchical clustering of 875 different individual samples and ~2000 different peptides indicating that patients with different diseases show a common pattern of binding per disease. FIG. 10B shows principal component maps for the same peptides and provides a view of the degree of separation between samples.

FIG. 11 shows an analysis of the number of different peptides binding to different numbers of antibodies.

FIG. 12 compares the binding profile of several monoclonal and polyclonal antibodies to different peptides in an array.

DEFINITIONS

Specific binding refers to the binding of a compound to a target (e.g., a component of a sample) that is detectably higher in magnitude and distinguishable from non-specific binding occurring to at least one unrelated target. Specific binding can be the result of formation of bonds between particular functional groups or particular spatial fit (e.g., lock and key type) whereas nonspecific binding is usually the result of van der Waals forces. Specific binding does not however imply that a compound binds one and only one target. Thus, a compound can and often does show specific binding of different strengths to several different targets and only nonspecific binding to other targets. Preferably, different degrees of specific binding can be distinguished from one another as can specific binding from nonspecific binding. Specific binding often involves an apparent association constant of 10³ M⁻¹ or higher. Specific binding can additionally or alternatively be defined as a binding strength (e.g., fluorescence intensity) more than three standard deviations greater than background represented by the mean binding strength of empty control areas in an array (i.e., having no compound, where any binding is nonspecific binding to the support). Most informative compounds have binding strengths within 1000-fold of three standard deviations of the background level. The range of affinities or avidities of compounds showing specific binding to a monoclonal or other sample can vary by 1-4 and often 2.5-3.5 orders of magnitude. An apparent association constant includes avidity effects (sometimes also known as cooperative binding) if present (in other words, if a target shows multivalent binding to multiple molecules of the same compound the apparent association constant is a value reflecting the aggregate binding of the multiple molecules of the same compound to the target).
The theoretical maximum of the avidity is the product of the multiple individual association constants, but in practice the avidity is usually a value between the association constant of individual bonds and the theoretical maximum. When contacted with a random selection of monoclonal antibodies, a subset of informative compounds (e.g., 1-20% or 5-15%) have association constants in the range of 10³ to 10⁶ M⁻¹ or 2×10³ to 10⁶ M⁻¹ or 10⁴-10⁶ M⁻¹ to at least one and sometimes several (e.g., at least 2, 5 or 10) different targets. A subset of all peptides or other compounds (e.g., at least 1%, at least 5% or 10%, 1-75%, 5-60%, 1-20% or 5-15%) usually shows actual affinity constants of 10³-10⁶ M⁻¹ to at least one and usually several targets (e.g., at least 2, 5 or 10). The same ranges of association constant apply to composite targets binding to the same compound in a complex sample. Of course, different compounds in an array have different degrees of binding strength to components of a sample and some compounds can bind with higher or lower apparent association constants than these ranges.
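
By way of illustration only, the association-constant ranges given here (in M⁻¹) can be related to the dissociation-constant ranges used elsewhere in this description (in molar units) by taking the reciprocal, which is the standard relationship between the two quantities. The function name below is hypothetical.

```python
def association_from_dissociation(kd_molar: float) -> float:
    """Convert a dissociation constant (M) to an association constant (M^-1).

    Uses the standard reciprocal relationship Ka = 1 / Kd.
    """
    if kd_molar <= 0:
        raise ValueError("dissociation constant must be positive")
    return 1.0 / kd_molar


# 1 mM and 1 uM, the bounds used for informative compounds in this description.
for label, kd in (("1 mM", 1e-3), ("1 uM", 1e-6)):
    print(f"Kd = {label} corresponds to Ka = {association_from_dissociation(kd):.0e} M^-1")
```

On this relationship, dissociation constants of 1 mM to 1 μM correspond to association constants of about 10³ to 10⁶ M⁻¹, matching the range stated above.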

Patients include humans; veterinary animals, such as cats, dogs and horses; farm animals, such as chickens, pigs, sheep and cattle; and laboratory animals, such as rodents, e.g., mice and rats.

A binding profile of an array is a measure of the amount of component(s) of a particular sample bound to the different compounds of the array. The amount of component(s) bound reflects the amount of the components in the sample as well as the binding strength of components to the compounds. A binding profile can be represented for example as a matrix of binding strengths corresponding to the different compounds in an array. A binding profile typically includes binding strengths of a plurality of compounds (e.g., at least 2, 10, 50, 100 or 1000) having dissociation constants in a range of 1 mM to 1 μM to a sample, or having binding strengths greater than but within a factor of 1000 of three standard deviations greater than the mean intensity of empty cells.
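
By way of illustration only, a binding profile can be held as a list of binding strengths in a fixed compound order and compared with reference profiles. The sketch below uses Pearson correlation as the similarity measure, which is an assumption chosen for simplicity and not a measure specified in this description; the function names, labels, and intensity values are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equal-length binding profiles."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("profiles must have equal length of at least 2")
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("a profile with no variation cannot be correlated")
    return sxy / math.sqrt(sxx * syy)


def closest_reference(sample, references):
    """Return (label, correlation) of the most similar reference profile."""
    scores = {label: pearson(sample, ref) for label, ref in references.items()}
    best = max(scores, key=scores.get)
    return best, scores[best]


# Hypothetical intensities for the same four compounds, in the same order.
sample = [1200, 300, 4500, 800]
references = {
    "disease_A": [1000, 350, 5000, 700],
    "healthy": [420, 400, 380, 410],
}
print(closest_reference(sample, references))
```

A practical comparison would typically involve many more compounds and replicate samples; the four-element profiles here only show the shape of the computation.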

Binding strength can be measured by association constant, dissociation constant, dissociation rate, or association rate, or a composite measure of stickiness which may include one or more of these measures. The strength of a signal from a labeled component of a sample bound to immobilized compounds can provide a value for general stickiness. If a term used to define binding strength is referred to as “apparent” what is meant is a measured value without regard to multivalent bonding. For example, the measured value of an association constant under conditions of multivalent bonding includes a plurality of effects due to monovalent bonding among other factors. Unless otherwise specified, binding strength can refer to any of these measures referred to above.

The term “nucleic acids” includes any and all forms of alternative nucleic acid containing modified bases, sugars, and backbones including peptide nucleic acids and aptamers, optionally, with stem loop structures.

The term “polypeptide” is used interchangeably with “peptide” and in its broadest sense to refer to a sequence of subunit natural amino acids or amino acid analogs, including unnatural amino acids. Peptides include polymers of amino acids having the formula H₂NCHRCOOH and/or analog amino acids having the formula HRNCH₂COOH. The subunits are linked by peptide bonds (i.e., amide bonds), except as noted. Often all subunits are connected by peptide bonds. The polypeptides may be naturally occurring, processed forms of naturally occurring polypeptides (such as by enzymatic digestion), chemically synthesized or recombinantly expressed. Preferably, the polypeptides are chemically synthesized using standard techniques. The polypeptides may comprise D-amino acids (which are resistant to L-amino acid-specific proteases), a combination of D- and L-amino acids, β amino acids, and various other “designer” amino acids (e.g., β-methyl amino acids, Cα-methyl amino acids, and Nα-methyl amino acids) to convey special properties. Synthetic amino acids include ornithine for lysine, and norleucine for leucine or isoleucine. Hundreds of different amino acid analogs are commercially available from e.g., PepTech Corp., MA. In general, unnatural amino acids have the same basic chemical structure as a naturally occurring amino acid, i.e., an α carbon that is bound to a hydrogen, a carboxyl group, an amino group, and an R group.

In addition, polypeptides can have non-peptide bonds, such as N-methylated bonds (—N(CH₃)—CO—), ester bonds (—C(R)H—C—O—O—C(R)—N—), ketomethylene bonds (—CO—CH₂—), aza bonds (—NH—N(R)—CO—), wherein R is any alkyl, e.g., methyl, carba bonds (—CH₂—NH—), hydroxyethylene bonds (—CH(OH)—CH₂—), thioamide bonds (—CS—NH—), olefinic double bonds (—CH═CH—), retro amide bonds (—NH—CO—), peptide derivatives (—N(R)—CH₂—CO—), wherein R is the “normal” side chain. These modifications can occur at any of the bonds along the peptide chain and even at several (2-3) at the same time. For example, a peptide can include an ester bond. A polypeptide can also incorporate a reduced peptide bond, i.e., R₁—CH₂—NH—R₂, where R₁ and R₂ are amino acid residues or sequences. A reduced peptide bond may be introduced as a dipeptide subunit. Such a polypeptide would be resistant to protease activity, and would possess an extended half-life in vivo. The compounds can also be peptoids (N-substituted glycines), in which the side chains are appended to nitrogen atoms along the molecule's backbone, rather than to the α-carbons, as in amino acids.

The term “polysaccharide” means any polymer (homopolymer or heteropolymer) made of subunit monosaccharides, oligomers or modified monosaccharides. The linkages between sugars can include acetal linkages (glycosidic bonds), ester linkages (including phosphodiester linkages), amide linkages, and ether linkages.

DETAILED DESCRIPTION

I. General

The invention provides arrays of compounds for use in profiling samples. The arrays include compounds binding to components of the samples at relatively low affinities. Although practice of the invention is not dependent on an understanding of mechanism, it is believed that under conditions of monovalent binding, different degrees of specific binding might be difficult to distinguish from each other and from nonspecific binding. However, the avidity of compounds binding to components of the samples can be increased by forming arrays such that components of the samples (e.g., antibodies or cells) can bind to more than one molecule of a compound at the same time. When a sample is applied to an array under such conditions, the compounds of the array bind to component(s) of the sample with significantly different affinities, generating a profile characteristic of the sample. Such a profile usually includes some compounds having no specific binding to components of the sample and other compounds having different degrees of specific binding to components of the sample. Although such binding interactions are specific in the sense that overall binding profiles of an array are reproducible for replicates of the same sample and distinguishable between different samples, they are not necessarily unique in that compounds in the array usually show specific binding, albeit of different degrees, to a number of different components of a sample or different samples.

The avidity of informative compounds (i.e., those showing specific binding) in an array can be measured for monoclonal antibody samples. When measured against monoclonal antibodies selected at random (e.g., purchased from a commercial supplier as described in the Examples), informative compounds in some arrays often show apparent association constants in a range of 10⁴-10⁹, 10⁶-10⁹, 10⁴-10⁷ or 10⁴-10⁶ M⁻¹. Association constants of such informative compounds are often within a range of 10³-10⁶ M⁻¹ or 10⁴-10⁵ M⁻¹. When measured against a complex sample, similar ranges of apparent or actual association constants are observed; however, in this case, the constants are a composite of values for multiple different components within a sample binding to the same compound. Such affinities can be distinguished from nonspecific interactions. The proportion of informative compounds (i.e., compounds that show distinguishable binding among different targets) can vary depending on the composition of the array and the sample, but ranges of 1-75%, 5-60%, 1-20%, 5-15%, or 7-12% provide some guide. Given the data in Example 1 showing that different monoclonal antibodies have their own signature, it might have seemed impossible to meaningfully resolve patient samples, which may contain 10⁸ or more different specificities of antibodies in the serum. When an array is hybridized against a more complex sample, such as from a patient, the binding profile represents the aggregate effect of multiple components of a sample. Surprisingly, despite the complexity of the samples, different samples are associated with different binding profiles. Also surprisingly, the intensity of binding profile often differs between patients with a disease or at risk of disease relative to normal patients.
Relatively more compounds are informative for disease patients or patients at risk of disease relative to normals and binding intensities are relatively higher (e.g., biased toward the higher end of the range for a random selection of monoclonals) than for the normal patients (intensities biased toward the lower end of the range for a random selection of monoclonals).

The binding profile of such an array to a sample can be used to characterize a sample. For example, the binding profile can be compared with binding profiles known to be associated with different diseases or stages of diseases or lack of diseases. Alternatively or additionally, the binding can be analyzed, for example, by using a compound binding relatively strongly to a component of the sample to affinity purify an antibody from the sample, or by comparing the sequence of a peptide in the array known to bind strongly to a component of a sample with a protein database to identify a protein in the sample. Remarkably, the same array can generate different and informative profiles with many different samples representing different disease states, disease stages, lack of disease and the like. Moreover, a profile characteristic of disease or departure from a nondisease state can be detected very early in development of a disease before typical analytical markers of disease would be detectable by conventional methods, such as ELISA.

II. Compounds for Use in Arrays

Many different classes of compounds or combinations of classes of compounds can be used for the arrays and methods of the invention. Classes of compounds include nucleic acids and their analogs, polypeptides (broadly defined as above), polysaccharides, organic compounds, inorganic compounds, polymers, lipids, and combinations thereof. Combinatorial libraries can be produced for many types of compounds that can be synthesized in a step-by-step fashion. Such compounds include polypeptides, beta-turn mimetics, polysaccharides, phospholipids, hormones, prostaglandins, steroids, aromatic compounds, heterocyclic compounds, benzodiazepines, oligomeric N-substituted glycines and oligocarbamates. Large combinatorial libraries of the compounds can be constructed by the encoded synthetic libraries (ESL) method described in Affymax, WO 95/12608, Affymax, WO 93/06121, Columbia University, WO 94/08051, Pharmacopeia, WO 95/35503 and Scripps, WO 95/30642 (each of which is incorporated by reference for all purposes). Peptide libraries can also be generated by phage display methods. See, e.g., Devlin, WO 91/18980. The test compounds can be natural or synthetic. The test compounds can comprise or consist of linear or branched heteropolymeric compounds based on any of a number of linkages or combinations of linkages (e.g., amide, ester, ether, thiol, radical additions, and metal coordination), dendritic structures, circular structures, cavity structures or other structures with multiple nearby sites of attachment that serve as scaffolds upon which specific additions are made. The compounds can be naturally occurring or nonnaturally occurring. Many different classes of compounds other than nucleic acids can be used, but optionally if the compounds are nucleic acids, the sample components detected are not nucleic acids. In some arrays, the test compounds have a molecular weight of between 500 and 10,000 Da, and optionally 1000 to 4000 Da.

The number of compounds is a balance between two factors. The more compounds, the more likely an array is to include members having detectable affinity for any target of interest. However, a larger number of compounds also increases the cost of synthesizing and analyzing an array. Arrays typically have at least 100 compounds. Arrays having between 500 and 25,000 compounds provide a good compromise between likelihood of obtaining compounds with detectable binding to any target of interest and ease of synthesis and analysis. Arrays having, for example, 100 to 50,000 members or 500-500,000, or 1000-25,000 members can also be used. However, arrays having much larger numbers of members, for example, 10²-10⁷ or 1000 to 5,000,000 or 500,000 to 2,000,000, can also be used. Such arrays typically represent only a very small proportion of total structural space, for example less than 10⁻⁶, 10⁻¹⁰, or 10⁻¹⁵ in the case of peptides. Sequence space means the total number of permutations of sequence of a given set of monomers. For example, for the set of 20 natural amino acids there are 20^(n) permutations, where n is the length of a peptide. Although it is widely assumed that most if not all of the residues in a peptide epitope participate in binding to a target, it is much more likely that between two and five residues in a 10-12 mer epitope are involved in energetically favorable interactions with the target; the other residues are simply there to adjust the positions of the important residues, and to prevent inhibition of binding. Therefore, a relatively small number of peptides can provide a good representation of total sequence space, and include members capable of specific albeit low affinity interactions with a wide variety of targets. For example, 500-25,000 random peptides can cover evenly the entire shape space of an immune system (10⁷ to 10⁸ antibodies in humans).
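The relationship between array size and sequence space can be illustrated with a short calculation. The following is a minimal sketch, assuming an alphabet of the 20 natural amino acids and a single fixed peptide length; the function names and the array size of 10,000 used in the usage lines are illustrative choices, not values prescribed by the text.

```python
# Sketch: size of peptide sequence space (20^n) and the fraction of it
# that an array of distinct peptides can represent.
# Assumption: all peptides share one length and use 20 natural amino acids.

def sequence_space_size(length: int, alphabet_size: int = 20) -> int:
    """Number of distinct sequences of the given length (alphabet_size ** length)."""
    if length < 1:
        raise ValueError("length must be at least 1")
    return alphabet_size ** length


def fraction_covered(n_compounds: int, length: int, alphabet_size: int = 20) -> float:
    """Upper bound on the fraction of sequence space covered by n distinct peptides."""
    return n_compounds / sequence_space_size(length, alphabet_size)


if __name__ == "__main__":
    for length in (10, 12, 20):
        size = sequence_space_size(length)
        frac = fraction_covered(10_000, length)
        print(f"length {length}: space = {size:.3e}, fraction covered = {frac:.1e}")
```

Under these assumptions, an array of 10,000 peptides of length 10, 12 or 20 covers fractions below 10⁻⁶, 10⁻¹⁰ and 10⁻¹⁵ of sequence space, respectively.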

More compounds in the array should allow higher resolution of the diversity of compounds in the complex sample. For example, an array of 1 million compounds would begin to approach the complexity of antibodies in a person's serum and therefore should allow more resolution of complex samples. Yet, even with a much smaller number of compounds, one is able to resolve new immune responses from infection or immunization.

For polymeric compounds, the lengths of polymers represent a compromise between binding affinity and ease of synthesis. There is some relationship between peptide length and binding affinity, with increasing length increasing affinity. However, as peptide length increases, the likelihood of binding a binding site on a target that interacts with the full peptide length decreases. Cost of synthesis also increases with increasing length, as does the likelihood of insolubility. For peptide arrays, peptides having 8-35, 12-35, 15-25 or 9-20 residues are preferred. These ranges of monomer lengths can also be used for other polymers, although aptamers usually have longer lengths (e.g., up to 100 nucleotides).

The compounds (e.g., all or at least 80, 90 or 95%) are typically chosen without regard to the identity of a particular target or natural ligand(s) to the target. In other words, the composition of an array is typically not chosen because of a priori knowledge that particular compounds bind to a particular target or have significant sequence identity either with the target or known ligands thereto. A sequence identity between a peptide and a natural sequence (e.g., a target or ligand) is considered significant if at least 30% of the residues in the peptide are identical to corresponding residues in the natural sequence when maximally aligned as measured using a BLAST or BLAST 2.0 sequence comparison algorithm with default parameters described below, or by manual alignment and visual inspection (see, e.g., NCBI web site ncbi.nlm.nih.gov/BLAST or the like).
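The 30% identity criterion can be illustrated with a simplified check. The sketch below is not BLAST: it is a stand-in that slides the peptide along the natural sequence without gaps and reports the best fraction of identical residues, which is an assumption made purely for illustration. The function names and example sequences are hypothetical.

```python
# Sketch: fraction of peptide residues identical to a natural sequence
# under the best UNGAPPED alignment. This is a simplified stand-in for a
# BLAST alignment, used only to illustrate the 30% threshold.

def max_ungapped_identity(peptide: str, natural: str) -> float:
    """Best fraction of peptide residues matching the natural sequence over all offsets."""
    if not peptide:
        raise ValueError("peptide must not be empty")
    best = 0
    # offset is the index in `natural` aligned with peptide position 0
    for offset in range(-(len(peptide) - 1), len(natural)):
        matches = 0
        for i, residue in enumerate(peptide):
            j = offset + i
            if 0 <= j < len(natural) and natural[j] == residue:
                matches += 1
        best = max(best, matches)
    return best / len(peptide)


def has_significant_identity(peptide: str, natural: str, threshold: float = 0.30) -> bool:
    """True if at least `threshold` of the peptide residues are identical."""
    return max_ungapped_identity(peptide, natural) >= threshold


if __name__ == "__main__":
    print(has_significant_identity("ACDEFGHIKL", "ACDWWWWWWW"))
```

A gapped aligner such as BLAST can report higher identity than this ungapped scan, so a peptide passing this check as "not significant" may still need a proper alignment.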

Some compounds are randomly selected from total sequence space or a portion thereof (e.g., peptides in which certain amino acids are absent or under-represented). Random selection can be completely random, in which case any compound has an equal chance of being selected from sequence space, or partially random, in which case the selection involves random choices but is biased toward or against certain monomers, such as amino acids. Random selection of peptides can be made, for example, by a pseudorandom computer algorithm. The randomization process can be designed such that different amino acids are equally represented in the resulting peptides, or occur in proportions representing those in nature, or in any desired proportions. Often cysteine residues are omitted from library members with the possible exception of a terminal amino acid, which provides a point of attachment to a support. In some libraries, certain amino acids are held constant in all peptides. For example, in some libraries, the three C-terminal amino acids are glycine, serine and cysteine, with cysteine being the final amino acid at the C-terminus. A library chosen by random selection, once selected, is of known identity and can be reproduced without repeating the initial random selection process. Nevertheless, the compounds in such a library retain the same random relations with one another. For example, the peptides in a random library that is subsequently reproduced retain a random distribution throughout sequence space (with the possible exception of cysteine residues, if this residue is omitted). Collections of compounds, such as peptides, that are randomly distributed over sequence space are still considered random even if reproduced without repeating the initial random selection.
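A pseudorandom selection of this kind can be sketched as follows. This is a minimal illustration, assuming equal representation of the 19 non-cysteine amino acids at each variable position and a constant glycine-serine-cysteine C-terminus; the default of 17 variable residues, the seed value and the function names are illustrative assumptions.

```python
import random

# One-letter codes for the 20 natural amino acids with cysteine (C) omitted.
RESIDUES = "ADEFGHIKLMNPQRSTVWY"
# Constant C-terminal residues: glycine, serine, then cysteine at the C-terminus.
C_TERMINAL_TAG = "GSC"


def random_peptide(rng: random.Random, n_random: int = 17) -> str:
    """One peptide: n_random equiprobable non-cysteine residues plus the GSC tail."""
    body = "".join(rng.choice(RESIDUES) for _ in range(n_random))
    return body + C_TERMINAL_TAG


def make_library(n_peptides: int, seed: int = 0, n_random: int = 17) -> list[str]:
    """Library of distinct pseudorandom peptides; the seed only makes the sketch repeatable."""
    rng = random.Random(seed)
    library: set[str] = set()
    while len(library) < n_peptides:
        library.add(random_peptide(rng, n_random))
    return sorted(library)


if __name__ == "__main__":
    for peptide in make_library(5, seed=42):
        print(peptide)
```

Once generated, the list of sequences can simply be stored and resynthesized; the seed is a convenience of this sketch and is not needed to reproduce a library whose sequences are already known.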

The principles for selecting peptides and other compounds for arrays in the present methods are analogous to those for selecting initial libraries of compounds in producing synthetic antibodies, as further described in WO/2008/048970 and WO2009/140039.

III. Making Arrays

Compounds can be presynthesized and spotted onto a surface of an array or can be synthesized in situ on an array surface (see, e.g., Cretich et al., Biomol. Eng. 2, 77-88 (2006); Min et al., Current Opinion in Chemical Biology 8, 554-558 (2004), Breitling, Mol. BioSyst., 5, 224-234 (2009), U.S. Pat. No. 5,143,854; EP 476,014, Fodor et al., 1993, Nature 364, 555-556; U.S. Pat. No. 5,571,639, U.S. Pat. No. 5,593,839, EP 624,059, U.S. Pat. No. 6,620,584, EP 728,520). Customized arrays are also commercially available from suppliers such as Invitrogen or Pepscan. The surface is usually derivatized with a functional group that attaches to the compounds, optionally via a linker. Compounds can be attached via covalent or noncovalent linkages. The array surface can be a single contiguous surface of a support. Alternatively, an array can be formed by spotting or synthesizing different compounds on different particulate supports, such as beads. Peptides can be attached in either orientation (N or C) relative to the array. In general, the different compounds occupy different areas of a contiguous array or different particles in a particulate array. The identity of which compound occupies which area of an array or which particle is usually either known as a result of the synthesis process or determinable as a result of an encoding process. Encoding processes are commonly used for beads. The different areas in a contiguous array can be immediately adjoining, as may arise when such arrays are the result of in situ synthesis, or separated, which is often the result of spotting.

An area or cell of an array is a unit of surface area from which a separate signal is detectable. In some arrays, each area of the array is occupied only by molecules of the same compound except for possibly a small degree of bleed over from one area to another, due, for example, to imperfections in the array. In other arrays, some or all of the areas contain a pool of two or more different compounds. In such an array, the signal from an area containing a pool of two or more different compounds is the aggregate undivided signal from the compounds constituting the pool.

Such arrays typically contain from 100-5,000,000 compounds (e.g., 100-1,000,000, 500-100,000 or 500-25,000 compounds) as discussed above. These numbers of compounds can readily be accommodated in different regions of an array of the order of 1-5 cm² combined area.

Within any one area of a contiguous array or within any one particle of a particle array, many different molecules of the same compound are present. Because compounds are usually attached to a derivatized surface of a support or particle (e.g., a support or particle bearing a linker), the density of molecules within an area of an array or a particle can be controlled in part by the derivatization process, for example, the period of time and concentration of derivatizing agent used. The density of molecules can also be controlled by the attachment or in situ synthesis process by which a compound is attached to a support. The length of a coupling cycle and concentration of compound used in coupling can both affect compound density.

The density of different molecules of a compound within an area of an array or on a particle controls the average spacing between molecules of a compound (or compounds in the case of a pooled array), which in turn determines whether a compound is able to form multivalent bonds with a multivalent binding partner in a sample. If two molecules of a compound, or compounds in the case of a pooled array, are sufficiently proximate to one another, both molecules can bind to the same multivalent binding partner (for example, to the two arms of an antibody). For peptides of length 15-25 residues, an average (mean) spacing of 0.1-6 nm, 1-4 nm or 2-4 nm, e.g., 1, 2 or 3 nm, is, for example, suitable to allow different regions of the same compound to undergo such multivalent bonding. Average (e.g., mean) spacings are typically less than 6 nm because spacings of 6 nm or more do not allow simultaneous binding of two sites on the same target. For example, for peptides of lengths 15-25 residues, the two identical binding sites of one antibody could not span more than 6 nm to contact two peptides at once. The optimum spacing for multivalent interactions may vary depending on the compounds used and the components of the sample being analyzed.

The formation of multivalent bonds can be shown by several methods. For example, the binding of an array to an intact antibody (i.e., two binding sites) can be compared with an otherwise identical antibody fragment (e.g., a Fab fragment) having only one binding site. Stronger binding to the intact antibody than the antibody fragment (e.g., higher apparent association constant) indicates multivalent binding. Multivalent binding can also be shown by comparing the binding of an array of an immobilized compound to an intact antibody with two binding sites with the reverse format in which the antibody is immobilized and the compound is in solution. Stronger binding (e.g., higher apparent association constant) of the immobilized compound to the antibody in solution compared with immobilized antibody to the compound in solution provides an indication that the immobilized compound can form multivalent bonds to the antibody. If the capacity of compounds in an array to form multivalent bonds is tested in such a procedure, it is usually sufficient to test one or a few sample compounds from the array for such binding. If the compounds in the array are of similar type, e.g., peptides of the same length, and deposited or synthesized under the same conditions, it can be inferred that if one or a few compounds on an array (e.g., 1-10%) are capable of multivalent binding, then so are the others. It is also not necessary to test every array that is made. Association (i.e., affinity) constants of compounds can be measured by conventional methods using technologies like SPR, ELISA, Luminex and other solution-phase binding (e.g., monitoring changes in bound signal over time) when the antibody or other sample is immobilized and the compound is in solution. Conversely, apparent association constants can be measured when a compound is immobilized and antibody or other sample is in solution. Once suitable synthesis or deposit conditions have been established for achieving arrays capable of multivalent binding, other arrays can be made under the same conditions without individualized testing.

Usually, different compounds are deposited or synthesized in different areas of an array under the same conditions, so that if one compound is spaced so that it is capable of multivalent binding, most or all compounds are. In some arrays, at least 10%, 50%, 75%, 90% or 100% of compounds in the array are spaced so as to permit multivalent binding with a multivalent binding partner. However, it is not necessary that all compounds be deposited or synthesized with the same spacing of molecules within an area of the array. For example, in some arrays, some compounds are spaced further apart so as not to permit or permit only reduced multivalent binding compared with other compounds in an array.

The spacing can be measured experimentally under given conditions of deposition by depositing fluorescently labeled compounds and counting photons emitted from an area of an array. The number of photons can be related to the number of molecules of fluorescein in such an area and in turn the number of molecules of compound bearing the label (see, e.g., U.S. Pat. No. 5,143,854). Alternatively, the spacing can be determined by calculation taking into account the number of molecules deposited within an area of an array, coupling efficiency and maximum density of functional groups, if any, to which compounds are being attached. The spacing can also be determined by electron microscopy of an array.

Arrays having larger spacing that do not permit multivalent interactions, or do so to a reduced extent compared with the spacing described above, also have application in identifying high affinity interactions. This type of strategy can be used to identify peptides or other compounds, for example, that are very close structurally to the original epitope that raised the antibody response. Alternatively, for arrays of peptides from life space, this spacing facilitates identifying the true epitope.

The spacing between compounds can also be controlled using spaced arrays; that is, arrays on surfaces coated with nano-structures that result in more uniform spacing between compounds in an array. For example, NSB Postech amine slides coated with trillions of NanoCone apexes functionalized with primary amino groups spaced at 3-4 nm for a density of 0.05-0.06 per nm² can be used.

Array formats that can be used include microarrays, beads, columns, dipsticks, optical fibers, nitrocellulose, nylon, glass, quartz, mica, diazotized membranes (paper or nylon), silicones, polyformaldehyde, cellulose, cellulose acetate, paper, ceramics, metals, metalloids, semiconductive materials, quantum dots, coated beads, other chromatographic materials, magnetic particles; plastics and other organic polymers such as polyethylene, polypropylene, and polystyrene; conducting polymers such as polypyrrole and polyindole; micro or nanostructured surfaces, nanotube, nanowire, or nanoparticulate decorated surfaces; or porous surfaces or gels such as methacrylates, acrylamides, sugar polymers, cellulose, silicates, and other fibrous or stranded polymers.

An exemplary method of array preparation is as follows. A microarray is prepared by robotically spotting distinct polypeptides on a glass slide having an aminosilane functionalized surface. Each polypeptide has a C-terminal glycine-serine-cysteine as the three C-terminal residues and the remaining (17) residues determined by a pseudorandom computational process in which each of the 20 naturally occurring amino acids except cysteine has an equal probability of being chosen at each position. Polypeptides are conjugated to the aminosilane surface by thiol attachment of the C-terminal cysteine of the polypeptide to a maleimide (sulfo-SMCC, sulfosuccinimidyl 4-[N-maleimidomethyl]cyclohexane-1-carboxylate), which is covalently bonded to the aminosilane surface. The polypeptides are chemically synthesized, dissolved in dimethyl formamide at a concentration that may range from about 0.1 mg/ml to about 2 mg/ml, and then diluted 4:1 with phosphate-buffered saline prior to spotting. The concentration of peptide or other compound determines the average spacing between peptide molecules within a region of the array. A concentration of 1 mg/ml gives an average spacing of about 0.5 nm. The spacing decreases non-linearly with dilution at lower concentrations. The printed slides are stored under an argon atmosphere at 4° C. until use.

An exemplary calculation of spacing is as follows: spot size: 150 μm, spot area: 17671 μm², nanoprint deposition volume: 200 pL, peptide concentration: 1 mg/ml, deposition amount: 200 pg, # peptides deposited: 8×10¹⁰ per spot, 8×10¹⁰ peptides/17671 μm² = 4.5×10⁶ peptides/μm², 2.2×10⁻⁷ μm² area needed by 1 peptide (4.6×10⁻⁴ μm spacing).
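The arithmetic of this spacing calculation can be sketched in a few lines. The sketch assumes a circular spot, complete retention of the deposited peptide, and spacing taken as the square root of the area available per molecule. The peptide molecular weight is not stated in the text; the 1500 Da used in the usage lines is an assumption chosen only because it yields roughly 8×10¹⁰ molecules per spot. All function names are illustrative.

```python
import math

AVOGADRO = 6.022e23  # molecules per mole


def spot_area_um2(diameter_um: float) -> float:
    """Area of a circular spot in square micrometers."""
    return math.pi * (diameter_um / 2.0) ** 2


def molecules_deposited(volume_pl: float, conc_mg_per_ml: float, mw_da: float) -> float:
    """Molecules in a deposited droplet. 1 mg/ml equals 1 g/L and 1 pL equals 1e-12 L."""
    mass_g = volume_pl * 1e-12 * conc_mg_per_ml
    return mass_g / mw_da * AVOGADRO


def mean_spacing_nm(n_molecules: float, area_um2: float) -> float:
    """Mean spacing as the square root of area per molecule, converted from um to nm."""
    area_per_molecule_um2 = area_um2 / n_molecules
    return math.sqrt(area_per_molecule_um2) * 1000.0


if __name__ == "__main__":
    area = spot_area_um2(150.0)                   # 150 um spot
    n = molecules_deposited(200.0, 1.0, 1500.0)   # 200 pL at 1 mg/ml, assumed 1500 Da
    print(f"area = {area:.0f} um^2, molecules = {n:.2e}, "
          f"spacing = {mean_spacing_nm(n, area):.2f} nm")
```

With the figures given in the text (8×10¹⁰ peptides on 17671 μm²), this approach gives a mean spacing of a little under 0.5 nm.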

As well as including compounds randomly or without regard to the sample being analyzed, arrays can include other compounds known to bind particular targets, such as proteins, in a sample. These compounds can be antibodies, synbodies or peptides among others. Usually, such interactions are high affinity (e.g., greater than 10⁷, 10⁸ or 10⁹ M⁻¹). The number of such known binding partner compounds can be large; for example, there can be a different compound for at least 25, 50, 75, or 90% or substantially all of the known proteins expressed by a given genome, such as the human genome. The different known binding partner compounds occupy different areas of the array in similar fashion to randomly selected compounds. However, because the known binding partner compounds are in general capable of high affinity interactions, they can be used with or without an intermolecular spacing that permits multivalent interactions with the sample. Although one might think that inclusion of compounds selected at random or without regard to the sample being analyzed would be redundant in view of inclusion of known binding proteins to a large part or all of the encoded proteins in a genome, such is not the case because some diagnostic immune responses are the result of somatic mutation or non-protein components and are not detected by binding proteins to encoded proteins.

IV. Samples and Components to be Analyzed

The arrays and methods of the invention can be used for analyzing any kind of sample containing or potentially containing analyte(s) of interest. Of particular interest are samples from human or veterinary patients or laboratory model animals. Such samples can be blood (including whole blood, red cells, plasma and the like), urine, feces, saliva, CNS fluid, other body fluids, hair, skin, biopsies and the like. A profile can be obtained from a small volume of sample, e.g., ≦1 μl. Some samples are from patients known or suspected to be suffering from a disease. The identity of the disease may or may not be known. Some samples are obtained from patients known to have been subjected to a risk of disease but in which symptoms of disease are not yet evident. The risk can be genetic (e.g., a particular gene or family history) or experiential (e.g., exposure to a toxic chemical or radiation). Samples can also be obtained from patients who have been vaccinated to analyze the resulting immune response.

Samples from patients can include a wide variety of components subject to potential analysis by an array. The components most amenable to detection are those capable of multivalent bonding to compounds in the array. Such components include antibodies, which can support multivalent bonding through their pairs of heavy and light chains (i.e., two binding sites per antibody) and cells, which can form multiple bonds through multiple copies of receptors displayed from their outer surfaces. Viruses can also form multivalent bonds through different copies of coat proteins on their outer surface. Samples from patients can include many different antibodies and/or different cells and/or other components.

Samples can be analyzed with little if any further processing or can be subject to further processing such that only selected components of the sample (e.g., antibodies or cells) are analyzed with the array.

V. Methods of Detection

Binding interactions between components of a sample and an array can be detected in a variety of formats. In some formats, components of the samples are labeled. The label can be a radioisotope or dye among others. The label can be supplied either by administering the label to a patient before obtaining a sample or by linking the label to the sample or selective component(s) thereof.

Binding interactions can also be detected using a secondary detection reagent, such as an antibody. For example, binding of antibodies in a sample to an array can be detected using a secondary antibody specific for the isotype of an antibody (e.g., IgG (including any of the subtypes, such as IgG1, IgG2, IgG3 and IgG4), IgA, IgM). The secondary antibody is usually labeled and can bind to all antibodies in the sample being analyzed of a particular isotype. Different secondary antibodies can be used having different isotype specificities. Although there is often substantial overlap in compounds bound by antibodies of different isotypes in the same sample, there are also differences in profile.

Binding interactions can also be detected using label-free methods, such as surface plasmon resonance (SPR) and mass spectrometry. SPR can provide a measure of dissociation constants and dissociation rates. The A-100 Biacore/GE instrument, for example, is suitable for this type of analysis. FLEXchips can be used to analyze up to 400 binding reactions on the same support.

Optionally, binding interactions between component(s) of a sample and the array can be detected in a competition format. A difference in the binding profile of an array to a sample in the presence versus absence of a competitive inhibitor of binding can be useful in characterizing the sample. The competitive inhibitor can be, for example, a known protein associated with a disease condition, such as a pathogen or an antibody to a pathogen. A reduction in binding of member(s) of the array to a sample in the presence of such a competitor provides an indication that the pathogen is present.

The stringency can be adjusted by varying the salts, ionic strength, organic solvent content and temperature at which library members are contacted with the target.

VI. Applications

The arrays have a wide variety of applications in analyzing or characterizing clinical, veterinary, forensic, laboratory and other samples. As with conventional diagnostics, the arrays can be used to identify particular analytes within samples, for example, analytes associated with a particular disease. However, the methods can also be used to provide a binding profile of different compounds characterizing a sample. The binding profile represents the aggregate interactions of the compounds with different components in the sample, and can be characteristic of a particular disease, stage of disease or lack of disease. The different components can be complex (e.g., at least 10, 100, 1000 or 1,000,000 different antibodies and/or different cells).

A binding profile typically includes compounds whose interactions with the sample are nonspecific as well as compounds whose interactions with the sample reflect specific but low affinity interactions (i.e., apparent or actual dissociation constant between 1 mM and 1 μM). Compounds with higher affinity interactions (i.e., dissociation constant less than 1 μM) may or may not be present. Such higher affinity interactions, if present, may arise by chance as a result of a compound in the array being a mimetic of a natural binding partner of a sample component or as a result of including a control in which a compound is a known binding partner of a component of a sample. However, a sample can usually be adequately characterized by the binding profile of compounds with low affinity interactions with the sample, optionally in combination with compounds lacking specific binding to components of the sample. For example, the identity and relative binding of at least 2, 5, 10 or 50 compounds capable of low affinity specific binding to components of the sample can often be used to characterize the sample. Such low affinity interactions may in part be the result of compounds serving as mimetopes providing a linear epitope that (imperfectly) resembles an epitope against which an antibody in the sample was raised (e.g., a complex 3D-structure).
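The affinity ranges given above can be expressed as a simple classification. The sketch below assumes dissociation constants in molar units and uses the stated boundaries of 1 μM and 1 mM; the label for constants weaker than 1 mM and the function names are assumptions made for illustration. The association constant is taken as the reciprocal of the dissociation constant.

```python
# Sketch: classify an interaction by dissociation constant (Kd, in mol/L).
# Boundaries follow the ranges in the text; the label for Kd > 1 mM is an
# illustrative assumption, since the text does not name that range.

def association_constant(kd_molar: float) -> float:
    """Ka in M^-1 as the reciprocal of Kd in M."""
    if kd_molar <= 0:
        raise ValueError("Kd must be positive")
    return 1.0 / kd_molar


def classify_affinity(kd_molar: float) -> str:
    """Return a coarse affinity class for a dissociation constant."""
    if kd_molar <= 0:
        raise ValueError("Kd must be positive")
    if kd_molar < 1e-6:
        return "higher affinity"           # Kd below 1 uM
    if kd_molar <= 1e-3:
        return "low affinity"              # Kd between 1 uM and 1 mM
    return "below low-affinity range"      # Kd above 1 mM (assumed label)


if __name__ == "__main__":
    for kd in (1e-8, 1e-5, 1e-2):
        print(f"Kd = {kd:.0e} M -> {classify_affinity(kd)}")
```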

One application lies in analyzing samples from patients known or suspected to be suffering from a disease but in which the particular disease affecting the patient is not known. A conventional approach would be to perform separate assays for suspected diseases. By contrast, in the present methods, a single binding profile from the patient sample can be used to characterize the patient for many diseases, stage of disease or lack of disease. The binding profile can be used to characterize the sample for virtually any disease, including autoimmune disease, cancer, infectious diseases, and diseases of the CNS. Most if not all diseases involve some changes in antibodies, cells or other components present in patient samples, reflected in a binding profile. Some exemplary infectious diseases include bacterial, fungal and viral diseases, such as Valley Fever, Q-fever, Tularemia tularensis, Rickettsia rickettsii, HSV types I and II, HBV, HCV, CMV, Epstein Barr virus, JC virus, influenza A, B or C, adenovirus, and HIV. Because different infections give different profiles, different infections in a patient having multiple infections can be detected simultaneously. Some exemplary cancers that can be diagnosed or prognosed using the methods of the invention include glioblastoma, breast cancer, multiple independent primary cancer and/or recurrence situation, pancreatic cancer, lung cancer, myeloma, ovarian cancer and esophageal cancer. Precancerous cells that are morphologically distinguishable from normal cells but not yet cancerous can also be detected using the methods of the invention. Neurological diseases, such as Alzheimer's disease, although not generally considered to be autoimmune diseases, result in some changes in antibodies present in a sample. The same is the case for chronic diseases, such as asthma, rheumatoid arthritis, diabetes mellitus type 1, psoriasis, multiple sclerosis and others.

Another application lies in analyzing samples from patients known or suspected to have a particular disease, but in which the stage, severity or prognosis for the disease is unclear. Again, the binding profile can provide an indication of any of these factors.

Another application lies in analyzing samples from vaccinated patients to determine whether an adequate protective immune response is developing. The pattern of response in one patient can be compared, for example, with a patient who has been naturally infected with the pathogen and survived, a similarity of response pattern indicating the patient is likely to survive and a dissimilarity that the patient will get worse or die at least in the absence of alternate treatment. Alternatively, a profile of a patient or animal model immunized with a new vaccine (for example in a clinical or preclinical trial) can be compared with profiles of patients or control animals immunized with an existing vaccine known to be effective. In a further variation, patients being recruited for a clinical trial of a vaccine can be prescreened for binding profile. Those already having a binding profile similar to that of a patient immunized with a vaccine known to be effective or from a patient who has survived a natural infection can be eliminated from the trial because their inclusion might lead to a misleading placebo response.

Another application lies in screening samples from patients who have undergone organ transplant (particularly allotransplantation). The profile in a patient under test can be compared with profiles of patients undergoing organ transplant who have or have not undergone rejection following the transplant. Similarity of the profile between a patient under test and a patient who has previously undergone rejection (or an average profile of a collection of such patients) indicates that the patient is at risk or is undergoing rejection.

Another application lies in analyzing samples from a patient known to be at risk of a disease but in which symptoms of disease are not yet present. The risk can be genetic, such as a genetic mutation associated with disease or family history of the disease, or arise as a result of experience, for example, exposure to a toxic chemical, radiation, traumatic accident, stress, fatigue, chemotherapy, unprotected sex, or exposure to a subject with a contagious disease. Such a patient is naturally concerned about the possibility of acquiring a disease and early therapeutic intervention. The methods are particularly useful in crisis situations in which many subjects have had potential exposure to a risk. Conventional diagnostic assays often have a significant lag period before a disease can be detected. For example, conventional viral assays can take several months to develop detectable patient antibodies. Autoimmune diseases (e.g., lupus, type 1 diabetes, rheumatoid arthritis, multiple sclerosis) can take several years to develop specific autoantibody or T-cell responses to specific autoantigens. By contrast, the present methods can detect changes in a profile within a few days (e.g., less than 10, 5 or 3 days) of exposure to a risk, or infection. The changes in binding profile may reflect subtle changes in concentrations of many different components of a sample, few if any of which would be individually detectable. However, in the aggregate, the changes in binding profile of the compounds in the array indicate a change if the risk has started development of disease.

Another application lies in forensic analysis of a sample, for example, a sample recovered from a crime scene or a sample relevant to a paternity analysis. Comparison of a test sample with one or more reference samples of known origin can provide an indication of the source of the test sample.

Binding profiles can be used in a variety of ways in characterizing a sample. In some methods, a binding profile of a sample is compared with one or more reference binding profiles of the same compounds. A reference binding profile is a profile that characterizes a particular disease, stage of disease or lack of disease, and the like. Reference profiles are typically determined by averaging binding profiles of several samples (e.g., at least 2, 20, 50 or 100) each characterized for the same disease, stage of disease or lack of disease.
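The averaging step used to build a reference profile can be sketched as follows. The sketch assumes each profile is a list of binding values, one per compound, in the same compound order, and that a simple arithmetic mean per compound is used; the function name and example values are hypothetical.

```python
# Sketch: build a reference binding profile as the per-compound mean of
# several sample profiles that share the same compound order.

def reference_profile(profiles: list[list[float]]) -> list[float]:
    """Arithmetic mean of each compound's binding value across the given profiles."""
    if not profiles:
        raise ValueError("at least one profile is required")
    n_compounds = len(profiles[0])
    if any(len(p) != n_compounds for p in profiles):
        raise ValueError("all profiles must cover the same compounds")
    # zip(*profiles) groups the values for one compound across all samples
    return [sum(values) / len(profiles) for values in zip(*profiles)]


if __name__ == "__main__":
    samples = [
        [1.0, 4.0, 0.5],
        [3.0, 2.0, 0.5],
    ]
    print(reference_profile(samples))
```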

Comparison of a sample binding profile with a reference binding profile can involve comparing the different binding strengths of different compounds in an array to the respective samples to derive a value representing the overall similarity of the profiles. A measure of similarity on a scale of similarity is by implication an inverse measure of dissimilarity and vice versa. Thus, a value representing the overall similarity includes a value representing the overall dissimilarity. However, mathematically, dissimilarity matrices can be handled and analyzed distinctly from similarity matrices. Raw data from the sample being analyzed can of course be normalized before the comparison to eliminate any differences due to sample size, processing, concentration and the like, rather than relative representation of sample components. Standard ANOVA analyses can also block such nuisance factors, provided such factors are accounted for in the experimental design.

Various techniques can be used to derive a value based upon the comparison of a binding profile and a reference binding profile. A derived value can be used to measure the dissimilarity between the binding profile and the reference profile and be evaluated using a distance measure such as the Euclidean Distance (ED) metric. The ED metric is typically used for measuring the distance between two vectors of “n” elements. According to one implementation, if x=(x1, x2, x3, . . . , xN) and y=(y1, y2, y3, . . . , yN) are two points in Euclidean N-space, then the Euclidean distance between x and y may be computed as:

$$D_{xy} = \sqrt{\sum_{i=1}^{N} (x_i - y_i)^2}$$

The ED metric is thus not a correlation (0 to 1), but a measurement of dissimilarity.

In the context of comparing a binding profile (defined by its binding values for each point in N-dimensional space, where N is the number of experimental points (conditions)) with a reference binding profile, an ED metric can be determined regardless of the complexity, number of peptides, or number of patients. Each profile being compared may be seen as a pattern: an explicit series of points set across time, across dilutions, across disease states, across symptoms, etc., and the comparison described here looks for data that reflects this defined series of points.

To standardize the difference between binding profiles being compared, the calculated ED measurement may be normalized by dividing by the square root of the number of conditions as follows:

\text{Distance} = \frac{|a - b|}{\sqrt{N}}

This differs from the aforementioned distance calculation in that it normalizes for the total number of conditions. This prevents the distance calculation from expanding too far given large numbers of samples.

Accordingly, calculating the Euclidean distance between two data points involves computing the square root of the sum of the squares of the differences between corresponding values. Because the ED metric is a measure of dissimilarity, the distance (d) may be converted, when needed, to a similarity measure as 1/(1+d). Distance, similarity, and dissimilarity are interchangeable to a certain degree, but each is uniquely useful given the calculations being applied. As the distance gets larger, the similarity gets smaller. This renders the original data useful for looking at differences in a non-biased and geometrical way. The computation is scalable with increasing number of experiments. In fact, the contribution of pattern complexity to the calculation is inherently diminished because the number of conditions is in the denominator and is under a square root.
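By way of illustration only, the Euclidean distance, its normalization by the square root of the number of conditions, and the conversion to a similarity measure as 1/(1+d) can be sketched in Python as follows. The function names and the example intensity values are hypothetical and are not taken from the experiments described herein.

```python
import math


def euclidean_distance(x, y):
    """Square root of the summed squared differences between two profiles."""
    if len(x) != len(y):
        raise ValueError("profiles must have the same number of points")
    return math.sqrt(sum((xi - yi) ** 2 for xi, yi in zip(x, y)))


def normalized_distance(x, y):
    """Euclidean distance divided by sqrt(N), where N is the number of conditions."""
    return euclidean_distance(x, y) / math.sqrt(len(x))


def similarity(d):
    """Convert a distance d into a similarity as 1 / (1 + d)."""
    return 1.0 / (1.0 + d)


# Hypothetical binding values for four array features (assumed, not measured).
sample = [1.0, 2.0, 3.0, 4.0]
reference = [1.0, 2.0, 3.0, 8.0]

d = euclidean_distance(sample, reference)        # only the last point differs, by 4
d_norm = normalized_distance(sample, reference)  # divided by sqrt(4)
s = similarity(d_norm)                           # larger distance gives smaller similarity
```

Identical profiles give a distance of zero and hence a similarity of one under this conversion.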

Other distance metrics that can be used include Euclidean Squared, Pearson Correlation, Pearson Squared, Spearman Confidence or Correlation, and other like techniques.

Binding profiles can also be used in various analytical methods to further characterize the sample. For example, a compound in the array showing relatively strong binding to the sample (compared with other compounds in the array) can be used to affinity purify a component of the sample. The component can then be further characterized (e.g., by sequencing or immunoreactivity). The identity of the compound may be characteristic of a disease state (e.g., a pathogen, autoantibody or tumor associated antigen). If the component is not already known to be characteristic of a disease state, it can be used as a new target for developing therapies or diagnostics against the disease state. For example, autoantigens or peptides thereof can be used in inducing tolerance of autoimmune disease. Alternatively, after washing off unbound cellular components, the cellular components binding to an array can be dissociated from the array, fractionated and analyzed in similar fashion. In a further variation, the identity of a compound in the array showing relatively strong binding to a sample can be used to identify a ligand of the component bound in the sample, and hence the component in the sample. For example, if the compounds of the array are peptides, the sequence of a peptide showing relatively strong binding to a sample can be compared with a database of protein sequences. Comparison can be pairwise between a database sequence and a peptide in the array or between a database sequence and a motif or consensus sequence from a plurality of peptides in the array. Sequence similarity to a protein in the database provides an indication that the protein is a ligand of the component in the sample to which the peptide showed strong binding. The identity of a ligand in turn provides at least an indication of potential molecules in the sample and in turn disease states characterized by such molecules.

The same array can be used in any of the applications described above and for virtually any disease or suspected disease state. The same array means either literally the same array, in which case the array may be washed between different samples, or different copies of an array of the same composition. The identity of which compounds in the array are most informative for a disease or other state being analyzed varies by state. Thus, having identified the most informative compounds for a particular disease, derivative arrays or other detection devices and kits can be made that have a reduced number of compounds including the most informative compounds. The derivative arrays are sometimes referred to as secondary arrays to distinguish them from primary arrays used in initial identification of binding compounds and sometimes a sample component bound by these compounds.

A further useful aspect of the present methods is that they can detect not only increased binding of compounds to cellular components in test samples relative to a control sample representing an undiseased subject (typically a human) but can also detect decreases. For example, some sample components, particularly antibodies, can be detected to decrease in a test sample, such as a disease or vaccinated sample or any other of the sample types mentioned, while other sample components increase.

VI. Derivative Analyses

As well as being useful in themselves for analyses of samples as discussed above, the present methods are also useful for determining derivative compounds and detection devices. In a simple form of such methods, a derivative device or other array is constructed containing one or more compounds known to be associated with a given disease, susceptibility to disease or other condition described above, and omitting other compounds from the primary array not found to be informative for this disease, susceptibility or other condition. In some such methods, only a small proportion of the compounds used in a primary array (e.g., less than 0.1%, 1% or 5%) are retained. In other methods, a component of the sample bound by some of the compounds in a primary array is identified by any of the approaches discussed in the previous section. Having identified a component of the sample, one or more known binding partners of the component are also identified. The known binding partners can be compounds from the primary array, antibodies to the component or another compound, such as a synbody, that is known to bind to the component. The known binding partner(s) can then be used to detect the sample component to which they are known to bind by any otherwise conventional diagnostic assay. For example, if the known binding partner is an antibody, the assay can be an ELISA, immunoprecipitation, radioimmunoassay or the like. If a plurality of known binding partners are used, the known binding partners can be immobilized in an array format. The known binding partners can also be incorporated into diagnostic kits or diagnostic devices (e.g., attached to a support). Such arrays, diagnostic devices and kits can be manufactured by conventional means. Of course, once the known binding partners of a component have been identified, it is not necessary to repeat the initial screening with the primary array for subsequent manufacture of such arrays, diagnostic devices and kits.

Although the invention has been described with reference to the presently preferred embodiments, various modifications can be made without departing from the invention. Unless otherwise apparent from the context any step, element, embodiment, feature or aspect of the invention can be used with any other. All publications (including GenBank Accession numbers and the like), patents and patent applications cited are herein incorporated by reference in their entirety for all purposes to the same extent as if each individual publication, patent and patent application was specifically and individually indicated to be incorporated by reference in its entirety for all purposes. If more than one version of a sequence is associated with a deposit number at different times, the version associated with the deposit number at the effective time of filing the application is meant. The effective time of filing means the earliest application from which priority is claimed disclosing the relevant accession number.

EXAMPLES

Example 1 Array Binding to Monoclonals

Triplicate copies of an array having 10,000 different random peptides, made by robotically spotting distinct polypeptides on a glass slide having an aminosilane functionalized surface as described above, were tested for binding to individual antibodies. The antibodies were hybridized at 100 nM, 37° C., 8 rpm, for 1 hour and detected with 5 nM secondary antibody. The antibodies tested included: (1) monoclonals, (2) polyclonals, (3) antibodies of unknown epitope, (4) monoclonals with linear epitopes, (5) monoclonals with discontinuous epitopes, (6) anti-Fc antibodies, (7) polyreactive antibodies, (8) autoantibodies, (9) mixed monoclonals, and (10) antibodies to glycans. Triplicate arrays were mathematically averaged, and the most informative 800 peptides were used to distinguish the relative differences. The binding of each antibody or antibody mixture to the 800 peptides can be represented by a series of colored bands as shown in the right of FIG. 1. Each band represents binding to a different one of the 800 informative peptides in the array. Different colors can be used to represent the different strengths of binding, for example, red being the highest, blue the lowest and yellow intermediate. The apparent binding affinities of most of the informative peptides to the antibodies tested range from about 10⁴ M⁻¹ to 10⁶ M⁻¹, or from 100 (background, e.g., an empty cell of an array lacking a peptide) to 65,500 in intensity units. The left portion of FIG. 1 shows principal component analysis of the patterns shown in the right-hand portion of the figure. Principal component analysis represents the binding profiles of different antibodies as spots on a two-dimensional chart, such that the relative distance between the spots is a measure of the relatedness of patterns. The principal component analysis shows that each of the antibodies has a distinguishable binding profile. The analysis also shows that technical replicates are very reproducible per array and per peptide. The distinct binding patterns of different monoclonal and polyclonal antibodies are also shown in FIG. 12.

FIG. 11 shows an analysis of the number of different peptides binding to different numbers of antibodies. For example, 1103 peptides bound to one and only one of the antibodies tested. 402 peptides bound to two of the antibodies tested. Six peptides bound to ten different antibodies, and so forth. Thus, the different peptides have different degrees of promiscuity in binding to multiple targets. Such a range of promiscuities can be of assistance in using an array to characterize multiple samples.

Of over twenty types of monoclonal antibodies, each produced a distinct pattern on the array. These differences included both what peptides bind and the relative binding to each. Signals on the array varied over 3 logs of fluorescence intensity. The results show antibodies against sugars, non-natural protein sequences (e.g., frameshift peptides, translocation junctions, splice variants), self-proteins, post-translational modifications (PTMs), multi-species, and multi-class (IgG, IgA, IgM, IgE) are all detected. Because each antibody has a distinct pattern of binding, one might expect that the 10⁸ repertoire of serum antibodies would create a monotonic pattern indistinguishable between different samples. Indeed a mathematical reconstruction of the binding of 30 monoclonals indicates that the distinctive patterns start to be lost. Remarkably, as shown in subsequent examples, this is not the case in reality.

Example 2 Dilution

A single monoclonal antibody, p53Ab1, was used to test the ability to discern a signature or binding profile in the presence of competing IgG. As before, 100 nM p53Ab1 was hybridized to the arrays to detect its signature. We added 1, 10, and 100-fold pooled human IgG as competitor and could still distinguish the original signal. For the 2-color experiments, IgG was detected with a red fluor, p53 with a green fluor and the competing spots were detected by multi-wavelength scanning. FIG. 2 shows the signal from the most informative peptides in the different conditions tested. The signal from the monoclonal stands out against the complex background of the IgG serum. This is a key and unexpected aspect of the immunosignaturing technique. High affinity immune responses, like monoclonals mixed in serum, stand out on the arrays from the normal antibodies.

Two antibodies were selected against p53, Ab1 and Ab8. Each was titrated into the other yielding a gradient of signatures. FIG. 3 shows the binding intensities to the most informative peptides in which different colors are used to represent different binding strengths. As seen, the strongest pattern, Ab8, is seen in the presence of another antibody and in the presence of competing IgG. Eight antibodies mixed together show a trend towards monotonic response. 100×IgG shows exactly this. Dilution shows that the pattern of p53Ab1 dominates a poorer antibody, p53Ab8, and excess IgG. Physically pooled antibodies and antibody signals mathematically averaged show similar patterns. Secondaries and tertiaries were not detectable.

Example 3 Resolving Disease

We collected normal donor sera, sera from patients infected with Valley Fever, and sera from Influenza vaccine recipients. Valley Fever patients were previously typed against CF antigen for titer. FIG. 4 shows the binding profiles of different patients for the most informative peptides, with different color bands representing different binding intensities. In general, normal undiseased patients showed a low intensity pattern. Patients infected with cocci showed a much higher intensity pattern even when no titer of cocci could be detected. Patients inoculated with flu vaccine showed varying patterns likely corresponding to the strength of the immune response. These patterns were readily distinguishable from the normal patients and cocci infected patients.

In a further experiment, samples from about 200 cancer patients having various kinds of cancer including glioma, sarcoma and pancreatic cancer were tested. FIG. 5 shows the binding profile of the most informative peptides in comparison with cocci infected patients and asthma patients. Principal component analysis confirms that the patterns of patients with these different diseases are distinct, and the patterns of cancer patients show clustering by the type of cancer.

In a further experiment, samples from patients at risk of pancreatic cancer and/or having pancreatitis were compared with patients having pancreatic cancer, patients having sarcoma or normal patients. Principal component analysis (FIG. 6) shows that some of the patients at risk of pancreatic cancer have a binding profile clustering with patients having cancer. However, patients having pancreatitis cluster separately from patients having pancreatic cancer. Thus, binding profiles can be used as a predictor of cancer development in at risk patients and to distinguish pancreatitis from cancer.

FIG. 7 compares binding profiles of normal patients, breast cancer patients, and patients at risk of breast cancer. The binding profiles distinguish breast cancer patients from normals, and classify pre-symptomatic cancer patients as having a breast cancer pattern years before detection of a tumor. The binding profiles of pre-symptomatic samples also differ from normals because of their lack of variability (a hallmark of normals).

Example 4

A sublethal influenza A/PR/8/34 infection of 1×10⁴ viral particles was used to impart protection against a later lethal challenge of the same strain in BALB/c mice. Lethal challenge using 2-5 mean lethal doses of influenza A/PR/8/34 occurred on day 35. The resulting immune response was protective as no overt clinical signs of infection or weight loss were observed following challenge. Portions of this example have been reported by the inventors in Lugutki et al, Vaccine (May 5, 2010) PMID: 20450869.

Sera from live influenza immunized mice were run on the arrays. At day 14, the influenza-specific IgG ELISA titer was 1:102,400, indicating virus specific antibodies were present. Differences in antibody binding to the CIM10K array could be detected 14 days after immunization, confirming that a change in humoral immune status could be detected. The 271 peptides recognized at least two-fold greater by live influenza immunized mice did not overlap with the 542 peptides recognized at least two-fold more than age matched naïve mice by mice immunized 29 days previously with F. tularensis LVS. Therefore an array of 10,000 random peptides is sufficient to generate immunosignatures that can distinguish between two infections.
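By way of illustration only, selecting peptides recognized at least two-fold more strongly than in naïve sera, and then checking two such lists for overlap, can be sketched in Python as follows. The function name, peptide identifiers and intensity values are hypothetical; the actual selection was performed on the CIM10K array data and its exact procedure is not restated here.

```python
def fold_increased(test, baseline, min_fold=2.0):
    """Return IDs of peptides whose test intensity is >= min_fold times baseline."""
    return {
        pid
        for pid, value in test.items()
        if pid in baseline and baseline[pid] > 0 and value / baseline[pid] >= min_fold
    }


# Hypothetical mean fluorescence intensities per peptide (assumed values).
naive = {"pep1": 100.0, "pep2": 200.0, "pep3": 150.0, "pep4": 120.0}
influenza = {"pep1": 450.0, "pep2": 210.0, "pep3": 160.0, "pep4": 300.0}
tularensis = {"pep1": 110.0, "pep2": 900.0, "pep3": 140.0, "pep4": 130.0}

flu_set = fold_increased(influenza, naive)   # peptides raised by the first immunogen
ft_set = fold_increased(tularensis, naive)   # peptides raised by the second immunogen
shared = flu_set & ft_set                    # an empty set means non-overlapping signatures
```

An empty intersection, as in this toy data, corresponds to the non-overlapping peptide lists reported for the two infections.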

Mice were infected with influenza and 21-day antisera was tested for an immunosignature pattern. We found a specific influenza pattern in mice that could be blocked using whole virus particles pre-adsorbed to antisera from infected mice but not by an irrelevant virus (M13) (FIG. 8). Comparison of the absorbed sera in an ELISA revealed a reduction in IgG reactivity for the virus from an endpoint titer of 512,000 for the unabsorbed sera to less than 500 for the virus absorbed sera. Reactivity for the possible to determine antibody reactivities that declined on immunization in a standard ELISA.

To determine whether these peptide populations were the result of the primary immunization or a combined effect of immunization and challenge, the recognition of these peptide populations was compared in immunized only mice. At 118 days post-immunization, immunized only mice had a mean fold increase from day 0 of 5.18±3.57 for the 283 time stable peptides compared to immunized and challenged mice at 98 days post-challenge, which had a mean fold increase of 4.19±2.55 from day 0. At the same time point, the suppressed peptides had a similar fold change from day 0 of 0.39±0.29 in the immunized mice and 0.36±0.098 in immunized and challenged mice. Taken together these results demonstrate that a lasting and complex immunosignature is generated upon initial infection and it is unique to the infecting organism.

We sought to determine whether the immunosignature on the CIM10K was transient or remained consistent over time and correlated to the antibody titer in a virus specific ELISA. The immunized animals were followed for 211 days post-challenge and the changes to peptide recognition levels by serum antibodies on the CIM10K array observed. In a standard ELISA against whole virus, the IgG titer rose to 819,200 by the day of challenge and remained elevated and unchanged during the post-challenge observation period. An “expression profile” reflecting the changes in bulk IgG ELISA titer was established in the GeneSpring software package. Comparison of the relative fluorescence intensities of recognized peptides to the profile identified peptides whose intensities over time matched the defined profile with a Pearson correlation of greater than 0.9. Profile analysis was similarly used to identify sets of peptides whose recognition was increased after immunization but diminished after challenge and those that were initially recognized at day 0 but were no longer recognized following immunization.
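By way of illustration only, matching peptide time courses to a defined profile by Pearson correlation with a cutoff of 0.9 can be sketched in Python as follows. This is not the GeneSpring implementation; the function names, the template values and the peptide time series are hypothetical.

```python
import math


def pearson(a, b):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0  # a constant series has no defined correlation; treat as no match
    return cov / math.sqrt(var_a * var_b)


def match_profile(series_by_peptide, template, threshold=0.9):
    """Return peptide IDs whose time course correlates with the template above threshold."""
    return [
        pid
        for pid, series in series_by_peptide.items()
        if pearson(series, template) > threshold
    ]


# Assumed template: titer rises and then stays elevated across five time points.
template = [1.0, 4.0, 8.0, 8.0, 8.0]

# Hypothetical fluorescence intensities per peptide at the same time points.
series = {
    "pepA": [100.0, 390.0, 810.0, 790.0, 800.0],  # tracks the template
    "pepB": [500.0, 480.0, 510.0, 495.0, 505.0],  # essentially flat
    "pepC": [900.0, 400.0, 120.0, 100.0, 110.0],  # suppressed over time
}

matched = match_profile(series, template)
```

A template that falls rather than rises could be used in the same way to pick out peptides whose recognition is lost after immunization.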

We sought to determine whether the antibody immunosignature on the CIM10K correlated to the immunizing dose of virus. Groups of BALB/c mice had increased weight loss and recovery time which correlated to increased infectious dose. Mice were allowed to recover to pre-infection weight and serum was collected at day post-infection. Bulk IgG endpoint titers against the virus were indistinguishable between groups with an endpoint titer of 102,400 for all groups. This suggests that the maximum detectable response in an ELISA was quickly reached at the lowest immunizing dose. Computer analysis of the CIM10K immunosignatures revealed 516 peptides that generally increased with infectious dose, of which a subset of 65 rose sharply. Comparison of the time stable peptides to the dose responsive peptides showed an overlap of 39 peptides, three times that predicted by chance. These results demonstrate that immunosignature has greater resolving power than whole virus ELISA in terms of distinguishing the infectious dose.
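By way of illustration only, one way to estimate the overlap predicted by chance between two peptide lists is the expected intersection of two independently drawn lists, namely the product of the list sizes divided by the number of peptides on the array. The source does not state how chance overlap was computed, so this formula, the function name, and the use of exactly 10,000 array peptides are assumptions; the list sizes (283 time stable, 516 dose responsive) and the observed overlap (39) are the values given above.

```python
def expected_overlap(size_a, size_b, total):
    """Expected intersection size of two lists drawn independently from `total` items."""
    return size_a * size_b / total


ARRAY_SIZE = 10_000       # assumed number of peptides on the array
TIME_STABLE = 283         # time stable peptides, as reported
DOSE_RESPONSIVE = 516     # dose responsive peptides, as reported
OBSERVED = 39             # observed overlap, as reported

expected = expected_overlap(TIME_STABLE, DOSE_RESPONSIVE, ARRAY_SIZE)
enrichment = OBSERVED / expected  # ratio of observed to chance-expected overlap
```

Under these assumptions the chance expectation is roughly 15 peptides, so an observed overlap of 39 is on the order of the three-fold excess stated above.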

Antibody responses to infection or vaccine can be measured by ELISA for each of the responding isotypes. We sought to determine whether or not the antibody reactivity on the random CIM10K peptide array could detect changes in antibody isotype. The IgM, IgA, IgG1, IgG2a and IgG3 immunosignatures were determined using serum samples from day 0 and day 28. Peptides recognized with relative fluorescence intensities greater than array features containing only buffer were analyzed. Populations of peptides recognized by the different isotypes are presented as a modified Venn diagram in FIG. 5. Peptide overlap sets containing more members than that predicted by chance overlap for equivalent sized lists were limited to certain overlap regions, suggesting potential class switching from IgM, and overlap with IgA, indicating the switch to IgA that is expected in containing an airways infection. Fitting an immune response to a viral infection, the IgG subtype response was predominantly IgG2a where nearly half the array was increasingly over two-fold recognized by day 28 serum. Conversely, only 144 peptides were over two-fold recognized by IgG1 antibodies. This demonstrates that the immunosignature can be subdivided based on antibody isotypes which reflects the pathogenesis of the responsible agent. In contrast to the ELISA protocol, each isotype assay only required 0.5 μl serum.

We sought to determine whether or not the immunosignature was consistent between biological replicates. Additional groups of BALB/c mice were separately obtained approximately 2 months apart and independently infected with the sublethal dose used for immunization. Weight loss was consistent for all groups and at day 28 each group had a bulk IgG titer of 19,200. To address the consistency of the immunosignature across independent infections, day 28 sera from the three infections were allowed to bind the CIM10K array and fluorescence intensities compared. For all peptides on the array the Pearson correlation was 0.94 between infections. The time stable peptides had Pearson correlations of 0.904, 0.936 and 0.912 between infections. Compared to the bioinformatic average of all naïves tested, the three infections had similar fold increase values for the 283 time stable peptides (2.23±1.22, 2.88±1.0, and 1.93±0.97). These results demonstrate that the immunosignature is consistent across biological replicates of infection and across technical replicates. There is no loss of consistency between biological replicates when using random peptides in place of the authentic virus antigen.

Naïve inbred mice provide a relatively empty canvas on which to develop an antibody repertoire. We sought to determine if a consistent immunosignature could be distinguished in a diverse human population. The immunosignature to the human seasonal 2006-2007 influenza vaccine was evaluated as our model. Individual sera from seven human donors were allowed to bind the array and analyzed for differences in peptide recognition between pre-immune and day 21. Donors were determined to have serum IgG antibodies in an ELISA against the 2006-2007 seasonal vaccine. For the human donors the median day 0 IgG titer was 1:3200 and the median day 21 titer was 1:12,800. Analysis identified 30 peptides that significantly increased (p<0.01) on day 21 at least 1.3 times that of the pre-immune or donor background values. These 30 peptides were sufficient to clearly distinguish the immune and pre-immune classes using principal component analysis. Four principal components were identified of 38%, 23%, 13% and 9% variance, respectively. One patient was separated across the X-axis for both immune and pre-immune samples and may suggest a low responding immune system. The remaining patient samples were still distinguished by class when this patient was excluded from the analysis. These results demonstrate that the immunosignature of serum antibodies is informative.

It would be of great advantage to know what antigen had produced a diagnostic antibody. We have developed a protocol that allows this “backtracking” from the array peptide to the natural protein. Basically the reactive peptide is resynthesized on Tentagel Beads. The beads are reacted with the serum to affinity purify the antibody. The antibody is then released from the beads. This antibody can then be used against an extract to identify the protein. Another method would be to make a phage library of antibodies from the individual producing the immune response and isolate the phage-ab binding the peptide of interest.

An array peptide that showed strong reactivity against influenza but not normal mouse sera was selected to pull down antibodies against influenza specifically. The pull-down was done using 10 μg peptide chemically conjugated to Tentagel® beads. Antibodies were eluted with pH 2.0 glycine, then immediately neutralized. PR8 was immobilized on nitrocellulose and probed with the pulled-down antibodies. FIGS. 9A, B and C show the pulled-down antibody detecting PR8 particles, a positive control anti-PR8 antibody picking up the virus particles, and a negative control pull-down from the beads alone. The data indicate we pulled down the appropriate anti-PR8 antibody with the peptide from the microarray that showed the strongest reactivity to PR8-infected mice.

The immunosignature can be used to classify the recipient based on immunogen. Protection against influenza infection in a murine model is largely based on the generation of neutralizing antibodies. To test the ability of the immunosignature to distinguish a protective antibody response from a non-protective antibody response, the archived serum samples were used to probe the CIM10K array. A random number generator was used to assign samples to either set A or set B so that each set contained four A/PR/8/34 infected groups, five KLH immunized groups and seven naïve groups. This way either set could be used as the training set and selected peptides used to predict the members of the second set. Peptides were selected from the CIM10K using the association algorithm in GeneSpring 7.3. For training on set A, 515 peptides with a median p value of 2.32×10⁻⁷ were identified. Support vector machines were used to test the ability of the 515 peptides to predict the immunogen used. When set A was used to train for prediction of set B, 81.3% of the samples were predicted correctly. When set B was used to train for prediction of set A using the 515 peptides, 93.8% of the samples were predicted correctly.
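By way of illustration only, the reported prediction accuracies can be related to whole numbers of correct calls. Each set contained four, five and seven groups, i.e., 16 in total; treating each group as one predicted sample is an assumption, as are the function names. The Python sketch below finds the number of correct calls out of 16 whose percentage, rounded to one decimal place, equals a reported value.

```python
from decimal import Decimal, ROUND_HALF_UP

SET_SIZE = 4 + 5 + 7  # A/PR/8/34 infected + KLH immunized + naive groups per set


def accuracy_percent(correct, total=SET_SIZE):
    """Percentage of correct calls, rounded half-up to one decimal place."""
    pct = Decimal(100 * correct) / Decimal(total)
    return pct.quantize(Decimal("0.1"), rounding=ROUND_HALF_UP)


def matching_counts(reported_percent, total=SET_SIZE):
    """Counts of correct calls whose rounded accuracy equals the reported percentage."""
    target = Decimal(reported_percent)
    return [k for k in range(total + 1) if accuracy_percent(k, total) == target]


a_to_b = matching_counts("81.3")  # train on set A, predict set B
b_to_a = matching_counts("93.8")  # train on set B, predict set A
```

Under this assumption, 81.3% and 93.8% correspond to 13 and 15 correct calls out of 16, respectively.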

As a test of the validity of the methodology to select predictive peptides, the association algorithm was applied to set B to identify 518 peptides with median p values of 6.34×10⁻⁵ for set B and 2.00×10⁻⁵ for set A. Support vector machines trained on set B to predict set A identified 81.3% of the samples correctly and training on set A to predict set B identified 75% of the samples correctly. The identified lists of associated peptides from set A and set B overlapped by 122 peptides, which is greater than predicted by chance overlap. The overlap list predicted 81.3% of the samples correctly regardless of which set was the training set or the test set.

Immunization with homologous and heterologous vaccines produces different antibody responses and degrees of protection to challenge with Influenza A/PR/8/34.

We next sought to determine whether the immunosignature had the power to distinguish similarly composed vaccines and stratify them based on outcome. As our model vaccines we used inactivated A/PR/8/34 as the homologous “good” vaccine and the 2006/2007 & 2007/2008 human seasonal influenza vaccines as the heterologous “bad” vaccines. Efficacy was determined in a murine lethal challenge model of influenza A/PR/8/34 infection. At day 40, serum IgG titers were measured. Mock immunized mice showed no reactivity for either A/PR/8/34 or the seasonal vaccines. Mice immunized with inactivated virus had IgG titers for all three antigens with a lower titer for the 2007/2008 seasonal vaccine components. The seasonal vaccines showed cross reactivity for each other with low reactivity for the A/PR/8/34 virus. Cross reactivity between seasonal vaccines is expected due to the inclusion of A/Wisconsin/67/2005 and B/Malaysia/2506/2004 in both vaccine compositions. The hemagglutinin and neuraminidase sequences from A/PR/8/34 do have some homology to the A/New Caledonia/20/99 and A/Solomon Islands/3/2006 sequences but appear to have generated a unidirectional cross reactivity in the ELISA.

Pooled serum samples were also used to probe the CIM10K. The association test was applied to the array data and 367 peptides associated with immunogen were identified. These peptides are sufficient to distinguish the groups with an average intergroup p value of 3.9×10⁻⁵. This demonstrates that the CIM10K has the resolving power to distinguish closely related immunogens from each other based on the humoral immune response. To account for the heterosubtypic crossreactive antibodies observed in the ELISA the three inactivated vaccines were compared to mock immunized. This comparison yielded 116 peptides with a p<0.01 of which 54 were increased in the vaccines over mocks. Relative intensity levels of the 54 peptides were comparable between groups. This list of peptides was tested in support vector machines for the ability to separate the A/PR/8/34 and the KLH immunized mice. For the combined sets A and B, no crossvalidation error was observed suggesting this set is likely composed of influenza specific cross reactive antibodies.

At challenge on day 42, mice immunized with inactivated A/PR/8/34 survived without clinical signs of illness. Mice immunized with the seasonal vaccines became sick and had 60% survival in the 2006/2007 seasonal vaccine group and 70% survival in the 2007/2008 seasonal vaccine group. Surviving mice in the seasonal vaccine groups recovered to prechallenge weight. This demonstrates that the vaccines are not equally protective in challenge with A/PR/8/34. Taken together these data demonstrate that the different vaccines generate distinctive immune responses as determined by whole virus ELISA and CIM10K and have different protective outcomes.

The immunosignature of live immunization and subsequent challenge can predict the efficacy of inactivated vaccines. To evaluate if the immunosignature of a naturally acquired infection and resistance to subsequent infection could predict the efficacy of an inactivated or subunit vaccine, we examined microarray data from mice infected with a sublethal dose of A/PR/8/34 and subsequently challenged with a lethal dose of the same strain 35 days later. We first asked if the peptides that were recognized by serum antibodies present in immune mice at the time of challenge were capable of predicting vaccine efficacy. Fifty peptides significantly increased (twofold increase with a p<0.01) from day 0 to day 28. Support vector machines trained on the immune and naïve mice (day 28 and day 0) for testing on the model vaccine trial were incapable of distinguishing the groups, suggesting an amplification of protective clones was required for survival in a subsequent challenge.

To test this hypothesis, we asked if peptides increasing between pre and post challenge could predict the efficacy of the inactive vaccines. Analysis identified 163 peptides as increasing greater than twofold from pre to post challenge with a p value less than 0.05. Support vector machines trained on the immune and naïve samples (day 28 and 0) were able to predict the challenge outcome in terms of success defined by no change in health status (inactive A/PR/8/34) or unsuccessful defined by illness (seasonal vaccines and mock). This demonstrated that the ability of the immunosignature to identify the antibody reactivities responsible for protection to a second exposure can predict the efficacy of a subunit or inactivated vaccine.

Example 5 Multi-Disease Classifier

About 875 individual samples from individuals with infectious disease, autoimmune disease, cancer, Alzheimer's and other diseases were analyzed independently on a 10,000 peptide array. The binding profiles were initially represented as a heat map as for other profiles (FIG. 10A). Principal component analysis of the binding profiles for different diseases is shown in FIG. 10B. Each of the diseases has a distinguishable profile. Although inflammation is a major component in many of the diseases, it is not a major contributor to their binding profiles, which remain predominantly distinct with a classifier error below 10%.

Example 6

The following table summarizes the results from several tests on clinical samples: FP (false positive), FN (false negative), and AUC (area under a ROC curve, a measure of accuracy of diagnosis).

Class                         Disease                          FP     FN     AUC    # disease samples
Pre-disease (presymptomatic)  panIN (pre-panc. Cancer)         0      0      1      8 human
                              Vaccine and Influenza challenge  0      0      1      15 human, 4 years
                                                               0.001  0      0.999  120 mice
                              Pancreatic Cancer                0.2    0             8 human
Infectious disease            Valley Fever (Cocci)             0      0      1      40 human
                                                               0      0      1      9 dog
                                                               0      0      1      60 mice
                              Q-Fever (Coxiella)               0      0      1      30 rabbit
                              Tularemia tularensis             0      0      1      21 mice
                              Rickettsia rickettsii            0      0      1      37 mice
                              Glanders (B. mallei)             0      0      1      26 mice
Chronic                       Asthma IgG                       0      0      1      25 human
                              Asthma IgE                       0      0.01   0.97   25 human
                              Asthma IgA                       0      0      1      6 human
Autoimmune                    Lupus                            0      0      1      60 mice
                              Type I diabetes                  0.13   0.02   0.80   56 human
Cancer                        Glioblastoma                     0      0      1      10 human
                              Breast                           0.01   0.009  0.986  156 human
                              Multiple primary                 0      0      1      7 human
                              Pancreatic cancer                0.06   0.02   0.92   122 human
                              Lung                             0      0.11   0.81   60 human
                              Myeloma                          0      0      1      15 human
                              Ovarian                          0      0      1      6 human
                              Esophageal                       0      0      1      2 human
Other                         Transplant                       0      0      1      6 human

Example 7 Epitope Mapping

Motif finding algorithms are able to find subtle patterns in sets of unaligned sequences. These algorithms may be classified in two main categories: deterministic and optimizing. Deterministic algorithms exhaustively search a sequence set for motifs fitting a well-defined set of criteria. Some popular implementations of deterministic motif finding algorithms are TEIRESIAS and PRATT (Rigoutsos, Bioinformatics. 14, 55-67 (1998); Jonassen (1997) Comput. Appl. Biosci. 13, 509-22).

The optimizing algorithms represent the motif probabilistically and try to maximize a scoring function. The optimization can be performed stochastically, such as by using Gibbs motif sampling, or by expectation maximization as implemented in MEME (Bailey (2006) Nucleic Acids Res. 1, W369-7316).

An optimization approach is preferred when it is not known what criteria the motif should fulfill. The GLAM2 implementation of the Gibbs motif sampling algorithm will be used here because it allows for gaps (Frith (2008). PLOS Comput Biol. 4, e1000071.17).

An alternative to finding a motif among the peptides is to compare the peptides one at a time to the antigen sequence(s). The algorithm implemented in the RELIC MATCH 5 program compares each peptide sequence to the target protein sequence in five amino acid windows, and scores each window for similarity (18). The scores for all of the peptides are added up across the protein sequence to predict a potential small molecule binding site. A similar approach can be used for predicting antibody recognition sites from dissimilar peptide sequences selected in a peptide microarray experiment.

Typically, epitope mapping is performed to identify the specific part of a protein target that is recognized by the antibody. A similar approach can be used to identify an unknown protein target of an antibody. This approach can be used in identifying the antigenic proteins in a pathogen, targets in an autoimmune disease, or discovering the cause of an unknown infection.

Antibodies with known epitopes were purchased from Labvision (Fremont, Calif.) and Abcam (Cambridge, Mass.). Mice were immunized with keyhole limpet hemocyanin (KLH) conjugated peptides and sera were obtained at day 35. After slides were passivated with 0.014% mercaptohexanol, antibody was diluted to 100 nM or sera were diluted 1:500 in 3% BSA, 0.05% Tween, PBS. Antibodies were incubated with slides for 1 hour at 37° C. in Agilent Chambers with rotation. Slides were washed three times with TBS, 0.05% Tween and three times with diH2O. The incubation and wash procedure was repeated with a biotinylated secondary antibody (Bethyl Laboratories, Inc. Montgomery, Tex.), then with Alexa-555 labeled streptavidin (Invitrogen, Carlsbad, Calif.).

Name   Isotype         Num of AA   pI     Hydropathicity
mAb1   IgG1            9           3.56   −0.9
mAb2   IgG1            9           8.75   −0.044
mAb3   IgG1            5           9.76   −0.02
mAb4   IgG2a IgG2b     6           5.84   0.517
mAb5   IgG1kappa       6           4.67   −0.583
pAb1   polyclonal IgG  20          4.21   −0.185
pAb2   polyclonal IgG  20          4.43   −1.245
pAb3   polyclonal IgG  20          6.4    0.87
pAb4   polyclonal IgG  20          8.64   −0.13
pAb5   polyclonal IgG  20          3.49   −0.315

Negative control arrays with no primary antibody or naïve mouse sera were also run for comparison. At least three replicate arrays were run for each antibody.

Antigen specific antibodies were absorbed from sera by binding to KLH immobilized on nitrocellulose membrane. A one by six centimeter nitrocellulose membrane was placed in a 15 ml conical tube with 1.0 mg/ml KLH in 2.0 ml PBS. The membrane was washed three times in TBST and incubated with 1.0% BSA in TBST for at least one hour or until used. After washing three times in TBST, the membrane was placed into 2.0 ml of sera diluted 1:500 in 3% BSA, 0.05% Tween, PBS buffer. Reactivity of sera for KLH was tested in an ELISA. Sera were considered absorbed when no reactivity for KLH was detected at the 1:500 dilution.

Negative control signals were subtracted from antibody signals to remove the contribution of the secondary binding. The top 500 peptides in fluorescent intensity were selected for each antibody. The number of times each peptide occurs in one of the top 500 peptides lists was tabulated. Peptides appearing in five or more lists were eliminated as they are likely Fc binders or other nonspecific interactions. Peptides from the array were compared to the epitope sequences to identify those with sequence similarity. The epitope was expressed as a GLAM2 motif and was used in GLAM2SCAN to search against the peptides from the array inserted in strings of cysteines, with an alphabet of equal amino acid frequencies. Peptides were sorted by the highest scoring match and lists of the best matching peptides were created. These lists were compared with lists of peptides that most strongly bind to each antibody and the proportion of overlap was examined. Test datasets were generated for the monoclonal antibodies by randomly selecting sequences from human Swissprot and then randomly selecting a window of that sequence the same length as the epitope sequence. Two hundred negative examples were generated for each monoclonal. One thousand random peptides were generated as the negative examples for the polyclonal antibodies with equal frequencies of the nineteen amino acids (cysteine was not included, as in the arrays). All of these sequences were inserted within a string of seventeen cysteines on each side to allow peptides to be aligned overhanging the test sequences.
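The selection of specific binders described above can be sketched as follows. This is a minimal illustration in Python, not the software used in the example; the function names, the dictionary layout of the input, and the toy values are assumptions introduced here. Signals are assumed to be already background-subtracted.

```python
from collections import Counter

def top_peptides(signal_by_peptide, n=500):
    """Return the n peptide IDs with the highest signal (hypothetical helper)."""
    ranked = sorted(signal_by_peptide, key=signal_by_peptide.get, reverse=True)
    return ranked[:n]

def specific_peptides(signals_by_antibody, n=500, max_lists=4):
    """Keep, for each antibody, the top-n peptides that occur in at most
    max_lists of the top-n lists. With max_lists=4, peptides appearing in
    five or more lists are dropped as likely nonspecific binders."""
    top_lists = {ab: top_peptides(sig, n) for ab, sig in signals_by_antibody.items()}
    # Count in how many top-n lists each peptide appears.
    counts = Counter(p for peptides in top_lists.values() for p in peptides)
    return {
        ab: [p for p in peptides if counts[p] <= max_lists]
        for ab, peptides in top_lists.items()
    }
```

For example, with six antibodies that all bind one shared peptide strongly, the shared peptide is removed from every list while each antibody's own peptide is retained.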

Motifs were generated from the peptide lists using GLAM2, with a starting width of five amino acids, 1,000,000 iterations without improvement, 10 runs, and an alphabet of equal proportions of the 20 amino acids. GLAM2SCAN was used to search the corresponding test sets for sequences matching the motif with the alphabet set as the default protein alphabet for the monoclonal antibodies or equal amino acid frequencies for the polyclonal antibodies.

GLAM2SCAN output is the score for each place the motif matches in the test sequence set. The test sequences were ranked by the highest score match within each sequence. The RELIC Fastaskan program was used to align the binding peptides to the test dataset.

The top 500 specific peptides were uploaded as the affinity selected peptides and the corresponding test dataset was uploaded as the FASTA file. Random peptides were not subtracted. Fastaskan compares each five amino acid window of the test sequence with the selected peptide sequence and sums the scores of the alignments above a threshold. It outputs a score for each test sequence corresponding to the window of maximum similarity between the peptides and that sequence.
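The idea of a windowed comparison can be illustrated with a simplified sketch. This is not the RELIC Fastaskan algorithm: the scoring here counts identical residues in ungapped five-residue windows, and the threshold value, the function names, and the choice to include scores equal to the threshold are all assumptions made for illustration.

```python
def window_identity(a, b):
    """Number of positions at which two equal-length windows are identical."""
    return sum(1 for x, y in zip(a, b) if x == y)

def max_window_score(test_seq, peptides, width=5, threshold=3):
    """For each width-residue window of test_seq, sum the identity scores of
    all peptide windows scoring at or above threshold, and return the
    largest such sum over the windows of test_seq."""
    best = 0
    for i in range(len(test_seq) - width + 1):
        target = test_seq[i:i + width]
        total = 0
        for pep in peptides:
            for j in range(len(pep) - width + 1):
                score = window_identity(target, pep[j:j + width])
                if score >= threshold:
                    total += score
        best = max(best, total)
    return best
```

A test sequence sharing a five-residue stretch with one selected peptide scores 5 under these assumptions, while a sequence with no shared residues scores 0.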

For both the GLAM2SCAN and the RELIC analysis, the rank of the true epitopes was compared to the test sequences using ROC analysis. A Matlab script to calculate the true positive and false positive rate for each score cutoff was obtained from world wide web//theoval.cmp.uea.ac.uk/˜gcc/matlab/roc/ and modified to smooth tied scores. The area under the ROC curve was also calculated using a Matlab script from the same website. The area under the curve is used to predict the probability of finding an epitope in a database of a given size. We will assume positive and negative examples will be selected from a database of a fixed size without replacement weighted by the probability that a positive is chosen over a negative as estimated by the area under the curve.
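The area under the ROC curve equals the probability that a randomly chosen positive example scores higher than a randomly chosen negative example. A minimal Python sketch of that calculation follows. It is not the Matlab script referenced above; counting a tied pair as one half is a common convention and is an assumption here about how ties might be handled.

```python
def roc_auc(positive_scores, negative_scores):
    """Area under the ROC curve computed by pairwise comparison.
    Each positive/negative pair contributes 1 if the positive scores
    higher, 0.5 if the scores are tied, and 0 otherwise."""
    wins = 0.0
    for p in positive_scores:
        for n in negative_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positive_scores) * len(negative_scores))
```

With positives scoring 3 and 2 and negatives scoring 1 and 2, three of the four pairs favor the positive and one is tied, giving an area of 0.875.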

To evaluate the peptide microarray platform, examples of known epitope antibodies were required to generate a test dataset. Five monoclonal antibodies with known linear epitopes, and five examples of anti-peptide polyclonal mouse sera raised against peptides selected from the array were used as the test set. The monoclonal antibodies were found to bind to a median of 64.1% (range 37.6%-74.9%) of the random peptides above the slide surface background and secondary only controls. Polyclonal sera showed similar peptide reactivity with a median of 63.6% (range 54.0%-68.6%). Replicate slides had an average Pearson correlation of 0.785 for monoclonals and 0.764 for polyclonals. A heatmap showed that each antibody has a distinct binding pattern on the array. Although there is some overlap between the peptides bound by each antibody, about 22% of the top 500 peptides recognized by each antibody are not recognized by the other nine antibodies tested. The uniqueness of the peptides recognized by each antibody implies that the peptide sequences may contain information about antibody specificity.

Each peptide sequence was scored for similarity against each protein sequence. Most of the peptides bound by the antibodies did not show strong sequence similarity to the epitope. However, there was some enrichment for sequence similar peptides among the binders. Most of the peptides bound are mimotopes rather than having any obvious similarity to the epitope.

To assess the predictive power of these sequences, the alignment of the peptides to the epitopes was compared to their alignment with a set of negative examples. The RELIC alignment program was able to align binding peptides to all of the monoclonal epitopes and 62.7% of the negative examples. The true epitopes had an average score of 14.3 whereas the negative examples had an average score of 5.9. The ROC analysis found an area under the curve of 0.87, indicating that a true epitope has an 87% chance of having a higher score than a randomly selected negative example. All of the polyclonals also had positive peptide alignment scores as well as 86.5% of the positive examples. The true epitopes had an average score of 14.7 whereas the negative examples had a score of 15.2. The ROC analysis (FIG. 5) indicates that a positive example has a 46% chance of having a higher score than a negative example based on the area under the curve. The monoclonal epitopes were predicted well by this method, whereas the polyclonal predictions were similar to chance. An algorithm capable of detecting subtle patterns may be able to garner predictive power from these peptide sequences. Convergent motifs were identified for all of the antibodies using GLAM2. The motifs for the monoclonal antibodies ranged from three to five amino acids in width. The polyclonal motifs were four to five amino acids wide. The monoclonal motifs matched the epitope sequences with an average score of 3.5, whereas the negative examples had an average score of −3.7. Polyclonal motifs matched the immunizing peptide with an average score of 3.8 whereas the negative examples had an average score of 3.7. The ROC analysis demonstrates that the monoclonal epitopes have an 89.8% chance of being scored higher than the corresponding negative examples in the motif analysis while the polyclonals have a 67.9% chance of scoring higher than the negatives. The motif finding approach demonstrated predictive power on both datasets.

          binders   aligner   both   expected   ratio   % of binders align
AbCamHA   350       181       6      5.88       1.02    1.7%
DM1A      391       379       21     13.75      1.53    5.4%
LNKB2     354       96        2      3.15       0.63    0.6%
P53Ab1    379       188       36     6.61       5.45    9.5%
P53Ab8    369       365       6      12.50      0.48    1.6%
Neg1      258       722       26     17.28      1.50    10.1%
Rco4      258       755       26     18.07      1.44    10.1%
Rco3      263       710       18     17.32      1.04    6.8%
Rco2      274       742       22     18.86      1.17    8.0%
Rco1      267       699       21     17.31      1.21    7.9%
average   316.3     483.7     18.4   13.07      1.55    6.2%

To test if combining the two approaches may improve the predictive ability, the scores from the RELIC analysis and the GLAM2 analysis were each scaled to have a minimum score of zero and a maximum score of one and averaged. The ROC analysis was performed on the averaged scores. The area under the curve was 0.92 for the monoclonals and 0.69 for the polyclonals. Based on the probability estimated from the ROC analysis, there is about a 70% chance of finding a monoclonal epitope in the top ten windows out of a one hundred amino acid protein. There is about a 21% chance of correctly identifying a polyclonal epitope in a small virus among the top 100 hits out of a possible 1000 amino acid database, which is a two-fold enrichment.
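The scaling and averaging step described above can be sketched in Python as follows. The function names and the handling of a constant score list are assumptions introduced for illustration; the two input lists are assumed to hold the RELIC and GLAM2 scores for the same test sequences in the same order.

```python
def min_max_scale(scores):
    """Rescale scores linearly so the minimum is 0 and the maximum is 1.
    A constant list is mapped to all zeros (an assumption for this sketch)."""
    lo, hi = min(scores), max(scores)
    if hi == lo:
        return [0.0 for _ in scores]
    return [(s - lo) / (hi - lo) for s in scores]

def combined_scores(relic_scores, glam2_scores):
    """Average the two rescaled score lists element by element."""
    relic = min_max_scale(relic_scores)
    glam2 = min_max_scale(glam2_scores)
    return [(r + g) / 2 for r, g in zip(relic, glam2)]
```

The combined scores can then be ranked and passed to the same ROC analysis used for the individual methods.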

Example 9 General Materials and Methods

Preparation of the Random Peptide Microarrays. The random CIM10,000 microarray consists of 10,000 20-residue peptides of random sequence, with a C-terminal linker of Gly-Ser-Cys-COOH. Each peptide was manufactured by Alta Biosciences, Birmingham, UK based on amino acid sequences we provided. Random sequences were provided by custom software (Hunter, Preston and Uemura, Yusuke, The Biodesign Institute). Nineteen amino acids (cysteine was excluded) were selected completely at random in each of the first seventeen positions with GSC as the carboxy-terminus linker. The synthesis scale was 2-5 mg total at ≥70% purity with 2% of the peptides tested at random by mass spectrometry. Dry peptide was brought up in 100% N,N′ dimethyl formamide until dissolved, then diluted 1:1 with purified water at pH 5.5 to a master concentration of 2 mg/ml. The original 96-deep-well plates were robotically transferred to 384-well spotting plates, and the peptides were diluted to a final spotting concentration of 1 mg/ml in phosphate buffered saline at pH 7.2. High-quality pre-cleaned Gold Seal glass microscope slides were obtained from Fisher (Fair Lawn, N.J., cat #3010). Each slide was treated with amino-silane, activated with sulfo-SMCC (Pierce Biotechnology, Rockford, Ill.) creating a quality-checked maleimide-activated surface designed to react with the peptide's terminal cysteine. During spotting, we employ a Telechem Nanoprint 60 using 48 Telechem series SMP2 style 946 titanium pins. Each pin is allowed to spot approximately 500 pL of 1 mg/ml peptide per spot, which is estimated by allowing for pin trajectory, surface dwell time, and the amount of liquid each pin holds. The spotting environment is 25° C. at 55% humidity. Each peptide is spotted twice per array; the arrays are spotted in an orange-crate packing pattern to maximize spot density. Fluorescent fiducials are applied asymmetrically using Alexa-647, Alexa-555, Alexa-488 and Alexa-350-labeled peptides. The fiducials are used to align each subarray during image processing. The printed slides are stored under argon at 4° C. until used. Quality control consists of imaging the arrays by laser scanner (a Perkin-Elmer ProScanArray HT, Perkin Elmer, Wellesley, Mass.) at 647 nm to image the spot morphology. If the batch passes this test, further testing of randomly selected slides with known proteins and antibodies allows QC of reproducibility. Array batches that fail to meet an array-to-array variability of 30% total CV are discarded. Data extraction is accomplished using GenePix Pro 6.0 (Molecular Devices Inc., Sunnyvale, Calif.).
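The sequence design described above (seventeen random positions drawn from nineteen amino acids, followed by a GSC linker) can be sketched in Python. This is an illustration only and not the custom software referred to in the text; the function names, the seeded generator, and the use of a uniform draw at each position are assumptions.

```python
import random

# The 20 standard amino acids minus cysteine (one-letter codes).
AMINO_ACIDS_NO_CYS = "ADEFGHIKLMNPQRSTVWY"
LINKER = "GSC"  # Gly-Ser-Cys at the carboxy terminus

def random_peptide(rng, n_random=17):
    """Return n_random uniformly drawn residues followed by the linker."""
    body = "".join(rng.choice(AMINO_ACIDS_NO_CYS) for _ in range(n_random))
    return body + LINKER

def random_library(size, seed=0):
    """Return a list of `size` random peptide sequences (duplicates possible)."""
    rng = random.Random(seed)
    return [random_peptide(rng) for _ in range(size)]
```

Calling random_library(10000) would give 10,000 sequences of 20 residues each, with the only cysteine at the final position for attachment to the maleimide-activated surface.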

Probing the Random Peptide Microarrays with Serum Antibodies. Slides were processed using a Tecan HS4800 Pro Hybridization station with a protocol adapted to antibody binding. General settings were wash duration of 30 sec at 11.0 ml/min and sample agitation was set to high. Arrays were blocked with 1×PBS, 3% BSA, 0.05% Tween 20, 0.014% mercaptohexanol for 15 min at room temperature (23° C.). Arrays were probed with 170 μl of serum diluted at 1:500 in incubation buffer (3% BSA, 1×PBS, 0.05% Tween 20) for 1 hour at 37° C. Slides were washed between steps with TBST. Bound IgG was detected using biotinylated anti-mouse IgG (Bethyl Laboratories, Montgomery, Tex.) for 1 hour at 37° C. The anti-alpha chain secondary antibody was detected using Alexafluor 649 labeled streptavidin at 5.0 nM in incubation buffer for 1 hour at 37° C. Final washes in TBST and distilled water were done to remove residual salt. Slides were dried by on-board nitrogen flow for 5.0 min. Images were recorded using an Agilent ‘C’ type scanner at both 543 nm and 633 nm.

Analysis of Random Peptide Microarray Data. Statistical analysis of microarray data was done with GeneSpring 7.3.1 (Agilent, Inc., Palo Alto, Calif.) by importing image-processed data from GenePix Pro 6.0 (Axon Instruments, Union City, Calif.). Calculations based on the GenePix prepared gpr text files were done on the median signal intensity per spot. Poor quality spots were excluded from analysis by flagging as “absent” upon visual inspection. Prior to analysis, each array was normalized to the 50th percentile to eliminate array to array variation and signal intensities of less than 0.01 were set to 0.01. Values from triplicate arrays were averaged and used in the analysis. Informative peptides were determined by comparing groups on a peptide to peptide basis. Peptides with a relative fluorescence intensity of 500 or greater and a p value <0.01 were selected. Further statistical analysis was conducted in Microsoft Excel 2003 SP3 or in GraphPad Prism version 4.00 for Windows (Graphpad Software, San Diego, Calif.). The principal component analysis feature in GeneSpring was used to distinguish serum samples based on selected gene lists. PCA was run using mean centering and scaling. Support vector machine analysis was run in GeneSpring using Fisher's Exact Test limited to 50 peptides, polynomial dot product order 1 and a diagonal scaling factor of 0 to generate cross validation errors and predict test sets.
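The per-array normalization and replicate averaging described above can be sketched in Python. This sketch is not the GeneSpring implementation; it assumes that normalizing to the 50th percentile means dividing every spot intensity by the array median, and the function names are introduced here for illustration.

```python
from statistics import median

def normalize_array(intensities, floor=0.01):
    """Divide each spot intensity by the array median (assumed meaning of
    normalizing to the 50th percentile) and raise values below floor to floor."""
    m = median(intensities)
    return [max(x / m, floor) for x in intensities]

def average_replicates(arrays):
    """Normalize each replicate array, then average each peptide across
    replicates. All arrays must list peptides in the same order."""
    normalized = [normalize_array(a) for a in arrays]
    return [sum(values) / len(values) for values in zip(*normalized)]
```

Under these assumptions, three replicate arrays that differ only by an overall scale factor yield identical normalized values, which is the array to array variation the normalization is meant to remove.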

Characterization of archived murine serum samples. To test whether an immunosignature correlates with vaccine efficacy, even if protection is driven by a small number of neutralizing antibodies, a sublethal influenza A/PR/8/34 infection was used as a model protective vaccine and KLH immunization as a model non-protective vaccine. KLH was chosen due to its ability to generate an equivalently robust humoral immune response. The average IgG titer against whole virus was 819,000 in A/PR/8/34 infected mice and 800 in KLH immunized mice. Lethal challenge using 2-5 mean lethal doses of influenza A/PR/8/34 occurred on day 35. The immune response generated by the sublethal infection was protective as no overt clinical signs of infection or weight loss were observed following challenge. Mice immunized with KLH received either KLH alone or unrelated peptide-KLH conjugates adjuvanted with Alum. The average anti-KLH IgG titer was 819,000 in KLH immunized mice and 1,600 in A/PR/8/34 immunized mice. Following challenge at day 35, KLH immunization imparted no benefit over naives. This indicates that the model vaccines had equivalently strong immune responses for their respective immunogen with little ELISA cross-reactivity and no protection due to immunization with KLH.

1. A method of analyzing a sample, comprising: (a) contacting the sample with an array of immobilized different compounds occupying different areas of the array, wherein different molecules of the same compound within an area are spaced sufficiently proximate to one another for multivalent binding between at least two of the different molecules in the same area and a multivalent binding partner; and (b) detecting binding of the different compounds in the array to component(s) of the sample, such as antibodies.
2. The method of claim 1, further comprising characterizing the sample from the relative binding of compounds whose binding to the sample would be difficult to distinguish from each other and nonspecific binding under conditions of monovalent binding but which show significantly different binding from each other and nonspecific binding due to multivalent binding of the sample to the compounds thereby generating a binding profile characteristic of the sample.
3. The method of claim 1, wherein the average spacing between different molecules of a compound in an area of the array is less than 6 nm.
4. The method of claim 1, further comprising characterizing the sample from the relative binding a plurality of the compounds binding to the sample with association constants of 1 mM to 1 μM.
5. The method of claim 4, wherein the plurality of compounds includes at least 10 or 100 compounds binding to the sample with association constants of 1 mM to 1 μM.
6. The method of claim 5, wherein the characterizing comprises comparing a binding profile of the sample that includes the relative binding of the plurality of compounds with a reference binding profile.
7. The method of claim 1, further comprising identifying a component of the sample that binds to the different compounds.
8. The method of claim 6, further comprising detecting the identified component with a binding partner known to bind the component.
9. The method of claim 7, wherein the identified component is detected with a plurality of different binding partners known to bind the component.
10. The method of claim 7, wherein the binding partner is an antibody to the identified component or a peptide known to bind the identified component.
11. The method of claim 7, wherein the binding partner is one of the different compounds detected in step (b).
12. The method of claim 7, wherein the binding partner is a synbody.
13. The method of claim 7, wherein the binding partner is immobilized to a support.
14. The method of claim 13, wherein the binding partner is immobilized to a support in an array.
15. The method of claim 14, further comprising forming a second array, the second array containing one or more of the different compounds in the array binding to identified component not all of the different compounds in the array.
16. The method of claim 1, wherein the second array contains less than 5% of the different compounds in the array.
17. The method of claim 1, further comprising forming an array or other device comprising compounds determined to bind to the sample but not all of the different compounds in the array.
18. The method of claim 1, wherein the detecting step detects binding of the different compounds to an antibody or antibodies in the sample.
19. The method of claim 1, wherein the detecting step detects binding of the different compounds to a biological entity displaying multiple copies of a protein from its outer surface.
20. The method of claim 1, wherein the biological entity is cell displaying multiple copies of receptor from its outer surface.
21. The method of claim 1, wherein the compound having strongest binding to the sample binds to the sample with a dissociation constant of 1 mM to 1 μM.
22. The method of claim 1, wherein the compounds are peptides or small molecules.
23. The method of claim 1, wherein the array has 500-50,000 peptides.
24. The method of claim 23, wherein the peptides are 10-30 amino acid long.
25. The method of claim 24, wherein the sequences of the peptides are randomly selected.
26. The method of claim 1, wherein the different immobilized compounds are selected without regard to the sample and the array further comprises a plurality of compounds known to bind different proteins also occupying different areas of the array.
27. The method of claim 26, wherein the plurality of compounds known to bind different proteins includes compounds known to bind at least 25%, 50% or 75% of different human proteins.
28. The method of claim 27, wherein the different immobilized compounds including a plurality of compounds known to bind at least 25%, 50% or 75% of different human proteins and 500-50,000 random peptides.
29. The method of claim 23, wherein the sequences of the peptides have less than 90% sequence identity to a known binding partner of the target.
30. The method of claim 23, wherein the sequences of the peptides have less than 90% sequence identity to known proteins.
31. The method of claim 1, wherein the contacting of the sample to the array is performed in the presence of a potential competitor of binding of the sample to the array.
32. The method of claim 31, wherein the competitor is a known binding partner of a suspected component of the sample.
33. The method of claim 1, wherein the sample is a patient sample, and the competitor is a protein known to be associated with a disease affecting the patient.
34. The method of claim 1, wherein the sample is a patient sample.
35. The method of claim 1, wherein the sample contains a plurality of antibodies.
36. The method of claim 34, wherein the patient is known or suspected to be suffering from a disease.
37. The method of claim 34, wherein the patient is known to be at risk of a disease but is not showing symptoms of the disease.
38. The method of claim 36, wherein the disease is an autoimmune disease.
39. The method of claim 36, wherein the disease is an infectious disease.
40. The method of claim 36, wherein the disease is of the CNS.
41. The method of claim 33, wherein the sample is a blood, urine, or CNS sample.
42. The method of claim 1, wherein component(s) of the sample are labeled.
43. The method of claim 1, wherein binding of the peptides to component(s) of the sample is detected using a secondary antibody.
44. The method of claim 43, wherein the secondary antibody is an isotype-specific antibody.
45. The method of claim 1, wherein binding of the peptides to component(s) of the sample is detected by SPR or mass spectrometry.
46. The method of claim 1, further comprising affinity purifying a component of the sample using a peptide determined to bind to the sample.
47. The method of claim 1, further comprising washing unbound component(s) of the sample from the array, and dissociating bound component(s) from the array.
48. The method of claim 47, further comprising preparing an antibody library from the patient and using a peptide to which an antibody in the sample binds as an affinity reagent to screen the library.
49. The method of claim 48, further comprising identifying a natural binding partner of the affinity purified antibody.
50. The method of claim 23, further comprising comparing the sequence(s) of peptide(s) binding to the component(s) of the sample to a database of natural sequences to identify natural binding partner(s) of components of the sample.
51. The method of claim 50, further comprising comparing a profile of different compounds binding to the sample with profiles of the different compounds associated with different diseases or different stages of a disease to diagnose a patient as having one of the diseases or stages of disease.
52. The method of claim 50, further comprising comparing a profile of different compounds binding to the sample with a profile of the different compounds associated with lack of a disease to determine whether a disease is present.
53. The method of claim 50, further comprising repeating the method for different samples from a plurality of patients with the same disease to develop a binding profile characteristic of the disease.
54. The method of claim 50, further comprising repeating the method for different samples from a plurality of patients with different disease to develop a plurality of binding profiles characteristic of different diseases.
55. An array of immobilized different compounds occupying different areas of the array, wherein different molecules of the same compound within an area are spaced sufficiently proximate to one another for multivalent binding between at least two of the different molecules in the same area and a multivalent binding partner.
56.-72. (canceled)
73. A method of manufacturing a device for use in detecting a component of a sample comprising; (a) contacting the sample with an array of immobilized different compounds occupying different areas of the array, wherein different molecules of the same compound within an area are spaced sufficiently proximate to one another for multivalent binding between at least two of the different molecules in the same area and a multivalent binding partner; and (b) detecting binding of the different compounds in the array to component(s) of the sample. (c) forming a device including a known binding partner of a component to which binding of the different compounds is detected in step (b).
74.-83. (canceled)
84. An array of immobilized different compounds occupying different areas of the array, wherein the different compounds include a plurality of compounds known to bind at least 25, 50 or 75% of the known human proteins and 500-50,000 random peptides, wherein different molecules of the same peptide within an area are spaced sufficiently proximate to one another for multivalent binding between at least two of the different molecules in the same area and a multivalent binding partner.
85. A method of testing a vaccine, comprising contacting a blood sample of a subject immunized with a vaccine against a pathogenic microorganism with an array of immobilized different compounds occupying different areas of the array; detecting a pattern of binding of the sample to the different compounds in the array; and comparing the pattern of binding to the pattern of binding of one or more reference samples, wherein the reference samples are from subjects who have survived an infection with the virus, similarity of binding profile between the subject and reference samples providing an indication the vaccine is effective against the pathogenic microorganism.
86. (canceled)